TOKEN  REINFORCEMENT,  CHOICE, 
AND  SELF-CONTROL  IN  PIGEONS 


BY 
KEVIN  D.  JACKSON 


A  DISSERTATION  PRESENTED  TO  THE  GRADUATE  SCHOOL 

OF  THE  UNIVERSITY  OF  FLORIDA  IN  PARTIAL  FULFILLMENT 

OF  THE  REQUIREMENTS  FOR  THE  DEGREE  OF 

DOCTOR  OF  PHILOSOPHY 

UNIVERSITY  OF  FLORIDA 

1993 


ACKNOWLEDGEMENTS 

I  thank  the  members  of  my  Ph.D.  committee,  Marc  Branch, 
Marvin  Harris,  Hank  Pennypacker,  Donald  Stehouwer,  Frans  van 
Haaren  and  especially  my  committee  chairs  Timothy  D. 
Hackenberg  and  E.F.  Malagodi.   Karen  Anderson  provided 
expert  assistance  with  the  figures.   Jeff  Arbuckle  commented 
helpfully  during  the  design  of  the  experiment.   Charlene 
Kruegar  did  most  of  the  initial  subject  training  and 
assisted  with  early  program  writing.   Eric  Jacobs  and  Cindy 
Pietras  often  served  as  surrogate  experimenters,  and  kept 
the  lab  running  through  the  duration.   A  special  thank  you 
goes  to  the  wonderful  people  of  the  Alachua  County 
Association  for  Retarded  Citizens  for  providing  support 
throughout  the  conduct  of  this  study  and  especially  during 
the  write  up.   I  thank  my  family  for  providing  important 
social  contingencies  regarding  my  commitment  to  this 
project.   I  especially  thank  my  wife,  Linda,  and  my 
daughter,  Julie,  for  their  tolerance,  patience,  and  love. 
Finally,  my  thanks  go  to  Metallica  and  Ted  Nugent  for 
setting  such  high  standards  and  for  providing  an  auditory 
context  in  which  to  work. 


TABLE  OF  CONTENTS 


ACKNOWLEDGEMENTS

ABSTRACT

GENERAL INTRODUCTION

    Self-Control as Behavior
    Individual and Cultural Benefits of Self-Control
    Experimental Analyses of Self-Control
    Experiments with Pigeons
    Human Self-Control and Interspecies Differences

EXPERIMENT 1

    Method
        Subjects
        Apparatus
        Procedure
    Results
    Discussion

EXPERIMENT 2

    Method
        Subjects and Apparatus
        Procedure
    Results
    Discussion

GENERAL DISCUSSION

APPENDIX

REFERENCES

BIOGRAPHICAL SKETCH


Abstract  of  Dissertation  Presented  to  the  Graduate  School 

of  the  University  of  Florida  in  Partial  Fulfillment  of  the 

Requirements  for  the  Degree  of  Doctor  of  Philosophy 

TOKEN  REINFORCEMENT,  CHOICE, 
AND  SELF-CONTROL  IN  PIGEONS 

By 

Kevin  D.  Jackson 

August,  1993 

Chairperson:   Dr.  E.  F.  Malagodi 
Cochair:   Dr.  Timothy  D.  Hackenberg 
Major  Department:   Psychology 

In  a  choice  between  an  immediate  small  reinforcer  and  a 
delayed  large  reinforcer,  an  organism  exhibits  "self- 
control"  if  it  chooses  the  delayed  reinforcer  and 
"impulsiveness"  if  it  chooses  the  immediate  reinforcer. 
Under  such  procedures,  humans  generally  exhibit  self-control 
but  pigeons  usually  respond  impulsively.   Six  pigeons  were 
exposed  to  self-control  procedures  involving  illumination  of 
light-emitting  diodes  (LEDs)  as  a  form  of  token 
reinforcement.   In  a  discrete-trials  arrangement  subjects 
chose between 1 and 3 LEDs; each LED was exchangeable for
2-s  access  to  food.   In  Experiment  1,  subjects  responded 
impulsively,  consistent  with  predictions  of  the  ideal 
matching  law  applied  to  LED  reinforcement,  and  with  previous 
findings  in  pigeons.   However,  within-session  patterns  of 
responding were more consistent with predictions of the
ideal  matching  law  applied  to  food  scheduling.   Differences 
in  food  delays  for  the  2  choices,  that  favored  the  small- 
reinforcer  choice,  prevented  a  clear  assessment  of  the  role 
of  LEDs  in  determining  choice.   In  Experiment  2,  the 
relative  influence  of  LEDs  and  food  was  investigated  in  the 
same  subjects  with  delays  to  food  from  either  choice 
response  equal  under  most  conditions,  but  unequal  in  others. 
All  subjects  exhibited  more  self-control  in  Experiment  2 
than  in  Experiment  1.   Four  subjects  preferred  the  delayed 
large  reinforcer  during  an  arrangement  that  closely 
resembled  typical  human  procedures,  suggesting  that  the 
nature  of  the  consequences  of  choice  responding  may  account 
for  previously  reported  differences  in  the  choice  responding 
of humans and pigeons. Token-reinforcer arrangements may
promote  self-control  in  a  manner  similar  to  commitment 
procedures.   The  LEDs  probably  functioned  as  conditioned 
reinforcers,  although  their  discriminative  properties  may  be 
more  relevant  to  the  obtained  self-control. 


GENERAL  INTRODUCTION 

Self-Control  as  Behavior 

We  speak  of  self-control  when,  despite  the  presence  of 
contingencies  that  increase  the  likelihood  of  one  class  of 
behavior,  an  individual  engages  in  an  alternative  behavior 
that is more beneficial in the long run. For example, one
might choose a piece of fruit from the refrigerator, instead of
one's favorite pastry, in order to improve overall health.
Self-control is frequently used, not only as a description
of  a  valued  form  of  behavior,  but  mistakenly  as  an 
internalized  explanation  for  that  behavior.   Unfortunately, 
this  practice  does  little  to  promote  an  understanding  of  the 
origins  and  mechanisms  of  self-control,  and  perpetuates  the 
myth  that  self-control  and  other  behavioral  patterns  are  the 
result  of  inexorably  mysterious  processes. 

Behaviorists  also  recognize  the  importance  of  self- 
control,  not  as  an  internalized  trait,  but  as  behavior  to  be 
explained.   Radical  behaviorists  in  the  tradition  of  B.F. 
Skinner  focus  on  relations  between  historical  and  current 
contextual  factors  and  the  occurrence  of  self-control,  as 
well as on technologies for enabling humans to acquire and
benefit from repertoires that are sensitive to long-term
consequences. In the seminal textbook chapter on this topic
(Skinner, 1953, chap. 15), Skinner defined self-control as
engaging in one behavior (controlling response) that alters
the occurrence of another behavior (controlled response),
thereby  producing  a  more  valuable  outcome.   Thus,  the 
controlling  response  of  counting  to  ten  when  angry  may 
decrease  the  probability  of  hitting  someone  (controlled 
response)  thereby  avoiding  the  potentially  aversive 
consequences  of  fighting.   Skinner  also  discusses  varied 
situations  in  which  individuals  produce  or  remove  a 
controlling  stimulus  of  some  response,  in  which  they  change 
the  relationship  between  behavior  and  its  consequences, 
arrange  for  deprivation,  or  manipulate  an  emotional 
variable.   Often,  self-control  involves  the  manipulation  of 
verbal  stimuli:  for  example,  making  and  then  following  a 
list  of  tasks  to  be  completed.   Stating  to  oneself  the 
beneficial outcome(s) of some behavior (a rule about the
behavior and its consequences) may also exemplify
self-control.

Recognizing  self-control  as  behavior  may  help  reveal 
the  variables  of  which  self-control  is  a  function.   It  may 
also  yield  important  practical  benefits,  such  as  new  self- 
control  techniques  and  technologies  for  teaching  self- 
control.   Skinner  attributed  much  of  his  own  success  to  the 
use of behaviorally based strategies of self-control
(Skinner, 1979), and even co-authored a book containing
self-control techniques relevant to behavioral changes
accompanying old age (Skinner & Vaughan, 1983). Others have
adopted  Skinner's  strategy,  endorsing  the  application  of 
behavioral  principles  toward  teaching  self-control  (e.g., 
Mahoney  &  Thoresen,  1974;  Runck,  1982;  Stuart,  1977).   Thus, 
self-control  may  proliferate  through  exposure  to 
scientifically  based  rules  about  behavior  and  through  the 
application  of  scientifically  based  technologies. 

Individual and Cultural Benefits of Self-Control

Much important human behavior can be viewed in terms
consistent  with  self-control,  that  is,  operant  behavior 
functionally  related  to  temporally  remote  consequences.   For 
example,  consider  a  person  who  encounters  a  valued  item 
while  shopping,  perhaps  a  stereo  system,  but  lacks  the  cash 
to  purchase  it.   The  person  may  use  a  credit  card,  gaining 
immediate  possession  of  the  stereo,  but  with  the  unfavorable 
remote  consequence  of  less  money  due  to  interest  payments  on 
the  credit  card.   Self-control  is  said  to  occur  when  instead 
of  purchasing  on  credit,  the  person  saves  enough  cash  to  buy 
the  item  directly  at  some  future  time,  thereby  avoiding  the 
added  cost  of  interest  on  money  borrowed.   Techniques  for 
achieving  this  type  of  self-control  may  include  cutting  up 
all  one's  credit  cards  or  only  buying  items  on  a  premade 
shopping list.

At the level of individual behavior, self-control often
determines  success  in  life.   An  individual  who  saves  money 
for  greater  long-term  gains  or  studies  now  because  of  job 
opportunities  later  is  likely  to  benefit  in  the  long  run  and 
be  more  successful  over  the  course  of  a  lifetime.   Many 
stories  of  individual  human  success  and  greatness  involve 
forgoing  of  immediate  gains  and  behaving  instead  toward  some 
long-term  objective  such  as  solving  an  important  problem  or 
completing  an  extensive  project. 

Human  cultural  patterns  may  also  be  viewed  in  terms  of 
self-control.   No  single  individual  can  build  a  highway, 
operate  a  manufacturing  plant,  or  cultivate  the  crops 
responsible  for  feeding  a  nation.   Instead,  such  tasks 
require the collective behavior of many individuals,
behavior  that  occurs  because  of  its  relationship  to 
important  deferred  outcomes.   Culture  may  thus  be  viewed  as 
a  system  by  which  human  behavior  (cultural  practices)  is 
collectively  brought  under  control  of  valuable  deferred 
outcomes.   Cultural  evolution  can  be  explained  in  terms  of 
the  relationship  between  cultural  practices  and  important 
outcomes,  particularly  outcomes  involving  increased  energy 
flow,  decreased  reproductive  pressure,  and,  in 
hierarchically  stratified  societies,  differential  advantages 
for  members  of  the  upper  strata  (Harris,  1974,  1977,  1980, 
1981, 1989). In the case of culture, the behavior of many
individuals is brought under the control of remote
consequences  through  the  arrangement  of  more  immediate 
socially  administered  reinforcement  and  punishment  and 
through  verbal  practices  that  include  rules  relating 
behavior  to  arbitrary,  nonarbitrary,  and  sometimes 
supernatural  consequences  (Glenn,  1985,  1988;  Malott,  1988; 
Skinner, 1953, 1974).

Just as self-control is important to human success, the
failure to respond to deferred consequences lies at the root
of many problems facing both individuals and the cultures of
which they are members. Many stories of individual human
failure  involve  "impulsive"  responding  or  behavior 
controlled  by  relatively  immediate  consequences.   An 
individual  behaving  under  control  of  short-term  outcomes, 
for  example,  by  spending  hours  each  day  watching  television 
instead  of  learning  new  job  skills,  by  consuming  goods  and 
services  at  a  rate  in  excess  of  income,  or  by  the  daily 
self-administration  of  drugs,  will  not  fare  well  in  the  long 
run. A frightening implication of this account of self-control
is that as the marketplace is increasingly flooded
with  electronic  entertainment  devices,  video  games,  video 
tapes,  advanced  audio  components,  and  other  computerized 
toys  capable  of  providing  hours  of  seemingly  endless 
varieties  of  relatively  immediate  reinforcing  outcomes, 
individuals  may  be  increasingly  less  likely  to  engage  in 
behaviors related to long-term, individually beneficial
consequences, and hence, less likely to succeed at life
(Skinner, 1986).

Social  problems  ranging  from  the  AIDS  epidemic,  in 
which  the  more  immediate  reinforcement  of  unprotected  sex 
overrides  the  potentially  lethal  outcome,  to  pollution  and 
the  destruction  of  the  earth's  ozone  layer,  in  which  more 
immediate  financial  gains  outweigh  tremendous  environmental 
costs,  can  be  viewed  as  failures  to  respond  to  important 
deferred  outcomes.   Similarly,  the  growing  national  debt, 
substandard housing construction in hurricane-prone areas,
and  the  needless  depletion  of  natural  resources,  all  involve 
failures  of  deferred  consequences  to  exert  control  over 
current  behavior. 

Although  cultural  evolution  involves  selection  by 
deferred  outcomes,  a  culture  may  also  fail  by  not  responding 
to  even  more  remote  consequences  of  some  of  its  practices 
(Glenn, 1988). Indeed, the history of human cultural
evolution  reveals  repeated  cycles  of  adopting  new  modes  of 
production,  momentarily  improving  living  standards,  and 
intensifying  production  until  ecological  limitations  are 
met,  producing  catastrophic  consequences  for  participants  in 
the  culture  (Harris,  1977,  1980).   In  response  to  such 
catastrophes,  a  process  of  radical  transformation  begins, 
new  cultural  practices  are  selected,  and  the  pre-existing 
culture  no  longer  survives.   These  catastrophes  are 
avoidable  by  increasing  investment  in  the  development  and 
adoption of more efficient technologies, adjusting the rate
of  production  intensification,  and  tolerating  sustained,  but 
less  severe,  reductions  in  living  standards.   As  Skinner  put 
it,  "The  evolution  of  culture  is  a  gigantic  exercise  in 
self-control" (Skinner, 1971, p. 205). In other words, a
culture  survives  when  it  is  responsive  to  the  remote 
consequences  (reinforcing  and  aversive)  of  its  practices 
(Skinner,  1971,  1981).   Responding  to  deferred  outcomes  is 
thus  at  the  heart  of  behavioral  ethics  and  the  high  value 
placed  on  cultural  survival.   For  all  of  these  reasons, 
self-control  may  be  the  most  important  problem  faced  by  the 
behavioral  and  social  sciences. 

Experimental Analyses of Self-Control

Experimental analyses of self-control focus primarily
on  the  role  of  procedural  and  historical  factors  on  choices 
of  individual  subjects.   Typically,  concurrent  schedules 
with  two  response  options  are  used  and  each  option  (choice) 
is  associated  with  its  own  reinforcement  schedule 
(Herrnstein, 1961). The experimental arrangement for
studying  self-control  typically  involves  a  choice  between  a 
larger,  delayed  reinforcer  and  a  smaller,  more  immediate 
reinforcer.   Under  these  conditions,  choice  of  the  delayed 
reinforcer  is  defined  as  "self-control"  whereas  choice  of 
the  immediate  reinforcer  is  defined  as  "impulsiveness." 
Investigations  of  self-control  have  focused  on  reinforcement 
schedule parameters, type of reinforcement, degree of
deprivation,  experimental  history,  the  availability  of 
different  responses  and  stimuli  during  experimental 
sessions,  and  other  characteristics  of  experimental 
subjects.

Experiments with Pigeons

Pigeons  have  served  as  subjects  in  most  nonhuman 
studies  of  self-control,  with  access  to  food  (grain)  as  the 
reinforcer  and  key  pecking  as  the  choice  response.   When 
faced  with  a  choice  between  an  immediate  small  reinforcer 
and  a  delayed  larger  reinforcer,  pigeons  almost  invariably 
prefer  the  smaller,  more  immediate  reinforcer  (Ainslie, 
1974; Logue & Pena-Correal, 1984; Logue, Rodriguez,
Pena-Correal, & Mauro, 1984; Mazur & Logue, 1978; Rachlin &
Green,  1972;  see  review  by  Logue,  1988).   For  example,  Mazur 
and  Logue  (1978)  exposed  4  pigeons  to  a  choice  procedure 
with  31  discrete  choice  trials  per  session.   Reinforcement 
rate  was  held  constant  by  starting  each  trial  1  min  from  the 
onset  of  the  preceding  trial.   Trials  began  with  the 
illumination  of  the  left  and  right  keys,  green  and  red 
respectively.   A  single  peck  on  the  right  key,  fixed-ratio  1 
(FR1), resulted in 2-s access to grain. Each left keypeck
produced  a  6-s  delay  period,  followed  by  6-s  access  to 
grain.   All  subjects  preferred  the  immediate  reinforcer, 
pecking the right key on nearly every trial.

Lea (1979) demonstrated that pigeons prefer a more
immediate  reinforcer  over  an  equivalent  delayed  reinforcer, 
even  when  rate  of  reinforcer  access  is  greater  when  the 
delayed  reinforcer  is  chosen.   This  demonstrates  the  potent 
effects  of  reinforcement  immediacy,  for  pigeons'  choice 
responding  is  extremely  sensitive  to  rate  of  reinforcer 
access  when  there  is  no  prereinforcer  delay  across 
alternatives  (de  Villiers,  1977).   In  a  related  study, 
Logue,  Smith,  and  Rachlin  (1985)  demonstrated  that  pigeons' 
choices  in  a  self-control  paradigm  were  insensitive  to 
postreinforcer  delay,  except  when  prereinforcer  delays  were 
equal  and  postreinforcer  delays  affected  the  rate  of 
reinforcer  access. 

A  notable  exception  to  the  usual  finding  of 
impulsiveness  in  pigeons  occurs  if  subjects  are  given  an 
opportunity  to  commit  in  advance  to  receiving  the  larger 
delayed reinforcer (Rachlin & Green, 1972). In Rachlin and
Green's  experiment,  five  pigeons  were  first  exposed  to  a 
standard  self-control  arrangement.   Using  a  discrete-trials 
procedure,  a  single  peck  on  a  red  choice  key  produced 
immediate  access  to  2-s  food,  whereas  a  single  peck  on  the 
green  key  produced  4-s  access  to  food  after  a  4-s  delay. 
Within  one  session,  all  subjects  showed  exclusive  preference 
for  the  red  key  (immediate  reinforcer)  that  was  maintained 
throughout subsequent exposures to this choice arrangement.
Next,  subjects  were  presented  with  a  concurrent  chains 
schedule.   At  the  start  of  each  choice  trial  both  response 
keys  were  illuminated  white  (initial  link)  and  a  fixed-ratio 
(FR) of 25 keypecks, distributed in any way between the two
keys, produced a blackout of T seconds. The terminal link
followed the blackout and depended on the location of the
25th keypeck. If the 25th keypeck was on the right key, the
terminal link consisted of the original choice situation
described above. If the 25th keypeck was on the left key,
only  the  green  key  was  illuminated  in  the  terminal  link,  and 
only  the  larger  delayed  reinforcer  was  available.   The  value 
of  T  was  manipulated  across  experimental  phases.   For  all 
subjects, the number of large-reinforcer choices (left
keypecks  during  the  initial  link)  and  entries  into  the 
terminal  link  associated  with  only  the  large  reinforcer 
increased  as  the  value  of  T  was  increased  from  0.5  to  16  s. 
Preference  reversals  occurred  in  3  subjects;  that  is, 
pigeons  that  primarily  pecked  the  right  key  at  shorter 
values  of  T  switched  over  to  the  left  key  as  the  value  of  T 
was  increased.   These  subjects  preferred  the  delayed,  larger 
reinforcer  and  thus  exhibited  self-control,  when  given  an 
opportunity  to  commit  to  that  option  far  enough  in  advance 
of  the  availability  of  the  smaller,  more  immediate 
reinforcer. 

Impulsive  responding  under  the  standard  self-control 
arrangement  and  the  preference  shifts  observed  in  the 
Rachlin and Green study are consistent with the ideal
matching law, an equation that is useful for describing and
predicting pigeon performance under two-component concurrent
schedule arrangements (Baum & Rachlin, 1969; Herrnstein,
1970):

\[ \frac{B_1}{B_2} = \frac{A_1 D_2}{A_2 D_1} \]

In this equation B1 and B2 represent the number of responses
on alternatives 1 and 2, respectively, and A1, A2, D1, and D2
represent the reinforcer amounts (A) and prereinforcer
delays (D) associated with the two options. According to
this  equation,  the  proportion  of  responses  allocated  to  an 
option  is  equal  to  the  relative  reinforcer  value  of  that 
option,  where  reinforcer  value  is  defined  as  the  product  of 
magnitude  and  immediacy  (1/delay)  of  reinforcement.   With 
concurrent  FR1  schedules,  subjects  tend  to  choose  the 
preferred  option  exclusively  (e.g.,  Herrnstein,  1958;  Logue 
& Pena-Correal, 1984); under such arrangements the matching
law  is  useful  primarily  as  a  predictor  of  the  direction  of 
preference.   If  the  ratio  B1/B2  is  greater  than  1, 
preference  for  option  1  is  predicted,  and  if  less  than  1, 
preference  for  option  2  is  predicted.   In  Rachlin  and 
Green's (1972) initial procedure, treating the
large-reinforcer choice as option 1, the ratio B1/B2 would be less
than 1 (substituting a small nonzero delay value for the
small reinforcer), which is consistent with the obtained
preference for the smaller immediate reinforcer. In later
conditions the value of T is added to the delay value of
each option; thus, as the value of T increases so does the
ratio of B1/B2. The increasing number of large-reinforcer
choices,  observed  as  T  increased  in  the  Rachlin  and  Green 
experiment,  was  therefore  in  qualitative  agreement  with 
predictions  of  the  ideal  matching  law.   The  matching 
equation  also  predicts  a  preference  reversal,  from  the  small 
reinforcer  to  the  large  reinforcer,  as  the  value  of  T 
increases.   This  occurred  in  3  of  5  subjects  of  the  Rachlin 
and  Green  study  and  has  since  been  replicated  in  many  other 
studies  with  pigeons  as  subjects  (e.g.,  Ainslie,  1974; 
Green,  Fisher,  Perlow,  &  Sherman,  1981;  Navarick  &  Fantino, 
1976).
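
The direction-of-preference argument above can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it assumes the ideal matching law in the form B1/B2 = (A1 x D2)/(A2 x D1), takes the Rachlin and Green (1972) values reported in the text (4-s food after a 4-s delay versus 2-s food delivered immediately), and substitutes an arbitrary small nonzero delay of 0.1 s for the immediate reinforcer. The function names, the 0.1-s value, and the intermediate values of T are assumptions chosen for illustration; only 0.5 s and 16 s are the endpoints of T given in the text.

```python
def matching_ratio(a1, d1, a2, d2):
    """Predicted response ratio B1/B2 under the ideal matching law.

    Reinforcer value is amount times immediacy (1/delay), so
    B1/B2 = (A1 / D1) / (A2 / D2) = (A1 * D2) / (A2 * D1).
    """
    if min(a1, d1, a2, d2) <= 0:
        raise ValueError("amounts and delays must be positive")
    return (a1 * d2) / (a2 * d1)


# Option 1 = large reinforcer, option 2 = small reinforcer (values from the text).
LARGE_AMOUNT = 4.0   # seconds of food access
LARGE_DELAY = 4.0    # seconds of prereinforcer delay
SMALL_AMOUNT = 2.0   # seconds of food access
SMALL_DELAY = 0.1    # ASSUMPTION: small nonzero stand-in for "immediate"


def ratio_with_commitment_delay(t):
    """B1/B2 when a delay of t seconds is added to both options."""
    return matching_ratio(LARGE_AMOUNT, LARGE_DELAY + t,
                          SMALL_AMOUNT, SMALL_DELAY + t)


def predicted_preference(ratio):
    """Direction of preference implied by the ratio B1/B2."""
    if ratio > 1:
        return "large"
    if ratio < 1:
        return "small"
    return "indifferent"


if __name__ == "__main__":
    # Illustrative values of T; 0.5 and 16 are the endpoints named in the text.
    for t in (0.0, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0):
        r = ratio_with_commitment_delay(t)
        print(f"T = {t:5.1f} s  B1/B2 = {r:.3f}  predicted: {predicted_preference(r)}")
```

Under these assumed values the ratio rises steadily with T and crosses 1 at roughly T = 4 s, which is the qualitative preference reversal described above; the exact crossover depends on the delay value substituted for the immediate reinforcer.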

Interestingly,  Logue  and  Pena-Correal  (1985)  found  that 
pigeons'  choices  in  a  self-control  procedure  were  not 
affected  by  changes  in  deprivation.   Four  pigeons  were  each 
deprived  to  65%,  80%,  and  90%  of  their  free-feeding  weights 
and  were  exposed  to  5  different  choice  arrangements  under 
each  deprivation  level.   As  predicted  by  the  matching  law, 
large-reinforcer  choices  increased  as  delays  to  the  small 
reinforcer  approached  the  value  of  the  large-reinforcer 
delay.   The  failure  of  deprivation  to  alter  choice 
responding  suggests  that  deprivation  produces  the  same 
percentage  change  in  the  value  of  each  reinforcer  (Logue, 
1988).

An important exception to the finding of impulsiveness
in  pigeons  and  to  predictions  of  the  ideal  matching  law 
involves  a  fading  procedure  developed  by  Mazur  and  Logue 
(1978).   Two  groups  of  4  pigeons  each  were  studied. 
Subjects  in  the  experimental  group  were  first  exposed  to  a 
discrete-trials choice between 2-s and 6-s access to grain,
each delayed 6 s from a choice. All subjects preferred the
large  reinforcer.   Over  the  next  11,000  trials,  the  delay  to 
the  small  reinforcer  was  gradually  reduced  towards  0  s 
(fading). Subjects nearly always chose the large reinforcer
across  conditions  in  which  the  delay  to  the  small  reinforcer 
was  greater  than  3  s,  a  finding  consistent  with  the  matching 
law.   When  the  delay  to  the  small  reinforcer  was  2  s  or 
less,  a  value  at  which  the  matching  law  predicts  exclusive 
preference  for  the  small  reinforcer,  2  subjects  continued  to 
prefer  the  large  reinforcer  and  all  subjects  continued  to 
make  large-reinforcer  choices  at  least  some  of  the  time. 
Subjects  in  the  control  group  were  only  exposed  to  the 
terminal  condition  of  the  experimental  group  and  then  to  a 
condition  in  which  the  small  reinforcer  was  delayed  5.5  s. 
Unlike  subjects  in  the  experimental  group,  these  subjects 
showed  nearly  exclusive  preference  for  the  small  reinforcer 
when  it  was  delivered  immediately,  a  finding  consistent  with 
predictions  of  the  ideal  matching  law.   Logue  and  Mazur 
(1981)  showed  that  the  self-control  observed  in  the  fading 
subjects  partly  depended  on  the  presence  of  stimuli 
(overhead lights) during the delay that were differentially
associated with the two choices. These stimuli apparently
enhanced  the  value  of  the  delayed  larger  reinforcer.   A 
later  study  confirmed  the  effects  of  this  fading  procedure 
on self-control in pigeons. Using an equation that includes
parametric  estimations  of  sensitivity  to  delays  and  amounts 
of  reinforcement,  it  was  shown  that  the  choices  of  pigeons 
exposed  to  the  fading  procedure  were  more  sensitive  to 
variations  in  reinforcer  amount  than  to  reinforcer  delay 
(Logue  et  al.,  1984). 
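
As a rough check on the matching-law predictions cited in this paragraph, the ideal matching law can be applied to the fading procedure. This is a sketch under the assumption that reinforcer value is simply amount divided by delay, with option 1 as the large reinforcer (6-s access, 6-s delay) and option 2 as the small reinforcer (2-s access, delay d); the symbol d is introduced here for illustration only.

```latex
\frac{B_1}{B_2} = \frac{A_1 D_2}{A_2 D_1} = \frac{6 \cdot d}{2 \cdot 6} = \frac{d}{2}
```

On this reading, the predicted ratio exceeds 1 (large-reinforcer preference) when d is greater than 2 s and falls below 1 (small-reinforcer preference) when d is less than 2 s.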

Other exceptions to the adequacy of the ideal matching
law for predicting preference in self-control arrangements
with pigeons include some concurrent-chain schedule
situations with equivalent variable-interval (VI) schedules
in  the  initial  links  and  fixed-interval  (FI)  schedules  in 
the  terminal  links.   With  equivalent  VI  schedules  in  the 
initial  link,  responses  are  distributed  across  both  options 
and  terminal  links  are  entered  equally  often  from  either 
option (Fantino, 1977). Relative response rate serves as
the  measure  of  preference  under  such  schedules.   Green  and 
Snyderman  (1980)  manipulated  reinforcer  delay  by  altering 
the  length  of  terminal-link   FI  components.   Pigeons  were 
exposed  to  a  choice  between  6-s  access  to  grain  after  a  long 
delay  and  2-s  access  to  grain  after  a  shorter  delay.   When 
the  ratio  of  delays  was  6:1  and  3:1,  preference  for  the 
large  reinforcer  decreased  with  increases  in  the  absolute 
value of the delays. With a delay ratio of 3:2, the
relative rate of large-reinforcer responses increased with
increases in delay values. Both of these findings are
inconsistent with matching-law predictions of no change in
preference  when  delay  ratios  are  constant.   Green  and 
Snyderman  also  examined  predictions  of  the  delay-reduction 
hypothesis  (Fantino,  1969,  1977),  a  model  that  bases 
reinforcer  value  on  the  reduction  in  delay  to  food 
associated  with  the  onset  of  terminal  components.   This 
model  is  consistent  with  the  changes  observed  under  delay 
ratios  of  6:1  and  3:2,  but,  like  the  matching  law,  predicts 
no  change  when  the  delay  ratios  are  3:1.   Navarick  and 
Fantino  (1976)  obtained  some  results  consistent  with  both 
the  matching  law  and  the  delay-reduction  model.   When  the 
value  of  the  terminal  link  FI  (delay)  associated  with  the 
small  reinforcer  was  consistently  10  s  shorter  than  the 
large,  the  number  of  large-reinforcer  choices  increased  as 
the  value  of  both  terminal  FIs  increased.   However,  similar 
increases  in  large-reinforcer  choices  occurred  when 
reinforcer  delays  (FI  values)  were  equal,  a  finding 
consistent  with  delay  reduction,  but  not  the  matching  law. 

Grosch  and  Neuringer  (1981)  exposed  pigeons  to  a  series 
of  self-control  arrangements  similar  to  those  used  by 
Mischel  (1974)  with  human  children  as  subjects.   Trial 
durations  alternated  between  5  and  15  seconds;  subjects 
could  wait  until  the  end  of  a  trial  and  receive  a  preferred 
grain mixture or peck a key during the trial and receive an
equal amount of a less preferred grain. Grain preferences
were  determined  prior  to  the  experiment  by  presenting  both 
grains  at  once  and  observing  which  grain  mixture  was 
consumed  first.   Self-control  was  measured  as  the  time 
subjects  waited  before  responding.   Self-control  was 
influenced  by  a  number  of  variables  that,  more  or  less, 
resembled  those  manipulated  by  Mischel.   (Some  of  Mischel's 
research  is  discussed  below.)   Pigeons  exhibited  less  self- 
control  when  food  was  visible  (although  the  presence  of  food 
increased  self-control  when  key  pecks  were  required  to 
obtain the preferred grain), when stimuli correlated with
food  (feeder  lights)  were  present,  or  when  food  was 
delivered  immediately  before  choice  trials.   Adding  an 
alternative  response  manipulandum  during  the  delay  increased 
self-control (see Logue & Pena-Correal, 1984, for a similar
finding). Prior reinforcement of waiting increased
self-control and prior punishment of waiting decreased
self-control. While these findings illustrate some of the
commonalities  in  the  choice  responding  of  humans  and  pigeons 
under  self-control  arrangements,  substantial  performance 
differences have also been observed.

Human Self-Control and Interspecies Differences

In  contrast  to  pigeons,  human  subjects  generally 
exhibit self-control in laboratory settings (Logue,
Pena-Correal, Rodriguez, & Kabela, 1986). Logue et al. (1986)
exposed adult females to choices between reinforcers of
varying  amounts  and  delays,  similar  to  the  choices  given  to 
pigeons by Logue et al. (1984). Subjects pressed a button
that  delivered  points  exchangeable  for  money  following 
sessions.   Access  to  the  button  was  controlled  by  pushing  a 
rod to the left or right (choice responses). The first
experiment involved a discrete-trials self-control
procedure.   Unlike  pigeons,  humans  in  this  study  preferred 
the  larger  delayed  reinforcer  over  the  smaller  more 
immediate  reinforcer  in  most  cases,  although  response  bias 
made  it  difficult  to  interpret  the  choices  of  some  subjects. 
During  the  remaining  experiments,  subjects  were  exposed  to 
concurrent  VI  schedules  with  various  arrangements  of  delays 
and  magnitudes  of  reinforcement  for  the  two  options.   When 
faced  with  a  choice  between  a  small,  relatively  immediate 
reinforcer  and  a  larger  delayed  reinforcer,  all  subjects 
made a greater number of delayed-reinforcer choices than
characteristically  made  by  pigeons  or  predicted  by  the  ideal 
matching  law.   In  30  of  38  cases  in  which  the  matching  law 
predicted  preference  for  the  more  immediate  reinforcer,  the 
humans  preferred  the  delayed  reinforcer.   These  findings  are 
consistent  with  many  other  studies  of  human  choice  which 
deviate  from  matching-law  predictions  and  from  the  usual 
pigeon  findings.   Instead  of  matching,  humans'  choices  tend 
toward  maximizing  overall  obtained  reinforcement,  and  are 
less  sensitive  to  the  diminishing  effects  of  delay  on 
reinforcer value (e.g., Belke, Pierce, & Powell, 1989; Flora
&  Pavlik,  1992;  King  &  Logue,  1987;  Mawhinney,  1982;  Millar 
&  Navarick,  1984;  Navarick,  1986).   There  are  various 
possibilities  for  explaining  the  differences  in  the  choices 
of  humans  and  pigeons,  some  of  which  will  be  reviewed  below. 
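
The matching-law predictions mentioned above can be illustrated with a minimal sketch. The equation is not stated in this section; the sketch assumes the ideal form commonly used in this literature, in which the ratio of choices equals the ratio of reinforcer amounts multiplied by the inverse ratio of reinforcer delays. The function names and the numerical values are hypothetical and are not taken from Logue et al. (1986).

```python
def matching_ratio(amount_large, delay_large, amount_small, delay_small):
    """Predicted ratio of large-reinforcer to small-reinforcer choices.

    Assumed ideal matching form (not quoted from the source):
        B_large / B_small = (A_large / A_small) * (D_small / D_large)
    Delays must be greater than zero.
    """
    if delay_large <= 0 or delay_small <= 0:
        raise ValueError("delays must be positive")
    return (amount_large / amount_small) * (delay_small / delay_large)


def predicted_preference(ratio):
    """Name the alternative that the assumed equation favors."""
    if ratio > 1:
        return "large, delayed"
    if ratio < 1:
        return "small, immediate"
    return "indifference"


# Hypothetical values: 3 units after 6 s versus 1 unit after 1 s.
ratio = matching_ratio(amount_large=3, delay_large=6,
                       amount_small=1, delay_small=1)
print(ratio, predicted_preference(ratio))
```

With these assumed values the ratio falls below 1, so the assumed equation favors the small, more immediate reinforcer. A subject that nevertheless chooses the large reinforcer on most trials deviates from the prediction in the direction reported for humans.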

Molar  maximization  models  of  choice,  which  assume 
behavior  maximizes  overall  obtained  reinforcement,  are  most 
consistent  with  human  self-control  performance  (e.g., 
Houston  &  McNamara,  1985;  Rachlin,  Battalio,  Kagel,  &  Green, 
1981).   Some  studies,  upon  which  these  models  are  based, 
have  demonstrated  preference  for  a  larger  more  delayed 
reinforcer  by  nonhuman  subjects,  when  such  a  choice 
maximizes  energy  intake  and  minimizes  energy  expenditure; 
procedural  discrepancies,  however,  make  it  difficult  to 
compare  these  findings  directly  with  the  studies  reviewed 
here (for further discussion see Logue, 1988). From this
perspective,  the  failure  of  molar  maximization  models  to 
account  for  pigeons'  performances  under  self-control 
arrangements  is  the  result  of  limitations  on  the  time  frame 
over  which  costs  and  benefits  are  balanced.   Such 
limitations  could  be  argued  for  on  an  evolutionary  basis  or 
could  be  viewed  as  a  result  of  historical  or  procedural 
factors.   Unfortunately,  it  is  unclear  at  present  which  of 
these  variables  is  critical  and  even  whether  pigeon  and 
human  differences  are  best  characterized  in  terms  of 
maximization  models  of  behavior. 

Performance differences between humans and pigeons
under  self-control  procedures  might  also  result  from  the 
participation  of  human  subjects  in  extensive  verbal 
communities  outside  of  the  laboratory  that,  especially  in 
capitalistic  societies,  are  likely  to  support  adherence  to 
maximization strategies differentially (Mawhinney, 1982).
In  addition  to  directly  reinforcing  maximization,  such 
histories  likely  establish  repertoires  of  following 
maximization  rules  and  stating  rules  to  oneself  about  how  to 
respond in ways that maximize reinforcement. Such an
interpretation is consistent with behavioral theory
(Skinner, 1974; also see Horne & Lowe, 1993, for an
excellent discussion) and is supported by direct evidence
that experimenter-provided rules can influence responding
under experimentally arranged contingencies (Bentall & Lowe,
1987; Catania, Matthews, & Shimoff, 1982; Horne & Lowe,
1993; Solnick, Kannenberg, Eckerman, & Waller, 1980) and by
inferential evidence that self-stated rules influence
responding during some human experiments (Baron & Galizio,
1983; Horne & Lowe, 1993; Laties & Weiss, 1963; Lippman &
Meyer, 1967; Logue et al., 1986; Lowe, Harzem, & Bagshaw,
1978; Matthews, Catania, & Shimoff, 1985; Sonuga-Barke, Lea,
& Webley, 1989). Sometimes instructions explicitly
encourage maximization patterns, as in the Logue et al. (1986)
study, in which written instructions to the subjects
included the statement, "Your task is to earn as many points
as you can" (p. 161). That such rules contributed to the
observed  tendency  to  maximize  is  supported  by  post- 
experimental  questionnaires  in  which  subjects  reported  that 
they  were  attempting  to  maximize  the  total  points  earned  and 
that  they  did  this  by  trying  to  time  the  delays  and 
durations  characteristic  of  button  availability.   More 
recently,  similar  correlations  between  human  subjects' 
verbal  reports  and  patterns  of  responding  were  obtained 
under various concurrent schedules (Horne & Lowe, 1993).
The authors of this study clarified how the responding of
verbal adult humans in operant experiments often involves an
interaction of verbal processes with experimental
contingencies.

Human  verbal  and  social  histories  are  also  implicated 
in  developmental  studies  of  self-control.   Sonuga-Barke  et 
al. (1989) exposed 4-, 6-, 9-, and 12-year-old children to
choices  between  1  and  3  tokens  exchangeable  for  candy  or 
toys  after  the  session.   Preference  was  assessed  with 
concurrent  VI  schedules  of  block  pressing.   Presses  on  one 
block  produced  a  10-s  delay  and  delivery  of  1  token;  presses 
on  the  alternate  block  resulted  in  3  tokens  after  a  delay 
that  ranged  from  20  to  50  s  across  different  conditions. 
With  these  delay  values,  reinforcement  could  be  maximized  by 
shifting  preference  from  the  large  to  the  small  reinforcer 
as  the  delay  to  the  large  reinforcer  was  increased.   Some  of 
the  4-year-olds  and  all  of  the  12-year-olds  showed  this 
pattern. While the 12-year-olds showed dramatic preference
shifts,  however,  the  4-year-olds'  shifts  were  from  near 
indifferent  responding  to  preference  for  the  smaller  more 
immediate  reinforcer.   The  4-year-olds  reported  a  strategy 
of  picking  the  large  reinforcer,  although  they  did  not  do  so 
with  any  consistency.   The  12-year-olds  gave  reports  that 
corresponded  to  their  performance  and  indicated  a  strategy 
of  attempting  to  maximize  reinforcement  by  timing  the  delays 
and  counting  tokens.   The  6-  and  9-year-olds  showed 
consistent  preference  for  the  larger  reinforcer  and,  like 
the  12-year-olds,  their  individual  verbal  reports 
corresponded  well  with  their  choice  responding.   The  results 
suggest  a  developmental  sequence  in  which,  between  the  ages 
of  4  and  6,  children  learn  to  wait  for  larger  delayed 
reinforcers,  and  between  the  ages  of  9  and  12,  learn  to 
wait,  or  not  wait,  for  a  larger  reinforcer  depending  on 
overall  obtained  reinforcement.   These  changes  were  likely 
aided  by  accompanying  changes  in  rule  stating  and  rule 
following  repertoires. 
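
The maximization argument for these delay values can be made concrete with simple arithmetic. The sketch below compares tokens per second of delay for the two blocks. It is a simplification that ignores the VI schedule intervals, and the intermediate delays of 30 and 40 s are assumed values, since only the 20 to 50 s range is given here. All names are hypothetical.

```python
SMALL_TOKENS, SMALL_DELAY_S = 1, 10   # 1 token after a 10-s delay
LARGE_TOKENS = 3                      # 3 tokens after a longer delay
LARGE_DELAYS_S = [20, 30, 40, 50]     # 30 and 40 are assumed intermediate values


def richer_option(large_delay_s):
    """Return which block yields more tokens per second of delay.

    Cross-multiplication compares 3 / large_delay with 1 / 10
    using integers only, so no rounding is involved.
    """
    large_side = LARGE_TOKENS * SMALL_DELAY_S
    small_side = SMALL_TOKENS * large_delay_s
    if large_side > small_side:
        return "large"
    if large_side < small_side:
        return "small"
    return "equal"


for delay in LARGE_DELAYS_S:
    rate_large = LARGE_TOKENS / delay
    rate_small = SMALL_TOKENS / SMALL_DELAY_S
    print(delay, round(rate_large, 3), round(rate_small, 3), richer_option(delay))
```

Under this simplification the large reinforcer yields more tokens per second at a 20-s delay (0.15 versus 0.10), and the small reinforcer yields more at a 50-s delay (0.06 versus 0.10), which is the direction of the preference shift described above.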

Other developmental studies, described by Logue (1988)
and Mischel and Mischel (1983), also implicate verbal
processes in choice. In these studies, children (3 to 12
years old) chose between preferred and nonpreferred
edibles.   The  preferred  edible  was  determined  on  the  basis 
of  prior  verbal  reports  of  the  subjects.   During  single- 
trial  experimentation,  subjects  were  instructed  to  wait  for 
the experimenter to return to get the preferred snack but to
signal  for  the  experimenter  to  return  to  get  the  less 
preferred  snack.   The  measure  of  self-control  was  the  time 
spent  waiting  for  the  experimenter  to  return.   Generally, 
the  longer  the  experimenter  was  away,  the  more  likely  it  was 
that  subjects  would  not  wait.   Subjects  were  also  less 
likely  to  wait  for  the  less  preferred  snack  than  the  more 
preferred  snack.   Older  children  were  more  likely  to  wait 
and  to  wait  longer  than  younger  children  (a  similar 
developmental  finding  has  been  reported  by  Burns  &  Powers, 
1975). Self-control is improved in these studies when
subjects  engage  in  distracting  activities  during  the  wait; 
restate  the  rule  about  getting  the  preferred  snack  by 
waiting;  make  general  or  abstract  statements  about  the  task 
(e.g., it is good to wait); avoid making statements about
the  taste,  texture,  or  consumable  characteristics  of 
edibles;  and  avoid  looking  at  the  edibles.   Older  children 
are  more  likely  to  describe  and  engage  in  these  strategies 
for  improving  self-control  and  to  prefer  choice  situations 
more  conducive  to  self-control  (e.g.,  situations  in  which 
the edibles are out of sight). Verbal reports may be seen
to  correspond  with  performance  in  these  studies,  in  that 
children  who  report  using  the  above  strategies  usually  wait 
longer,  and  children  who  wait  longer  are  usually  better  at 
describing  these  strategies  for  improving  self-control. 
Waiting  by  children  is  also  increased  when  the  experimenter 
provides instructions describing successful waiting
activities. Together, these findings suggest that the type
of self-verbalizations determines the length of waiting and
that, as children grow older, they become more skilled at engaging
in verbal strategies during the wait.

It  is  possible  that  some  self-stated  rules  about 
forthcoming  reinforcers  serve  a  function  analogous  to  the 
overhead  lights  present  during  the  delay  interval  associated 
with  the  larger  reinforcer  in  pigeon  studies  involving  delay 
fading  (Logue  &  Mazur,  1981;  Logue  et  al.,  1984;  Mazur  & 
Logue, 1978). With both pigeons and humans, events (lights
or  rules)  during  the  delay  that  are  differentially 
associated  with  obtaining  the  larger  reinforcer  enhance 
self-control.   These  delay-fading  studies  might  also  relate 
to  pigeon  and  human  self-control  differences,  in  that  human 
adults  are  more  likely  to  have  had  experiences  analogous  to 
the  fading  history  of  the  pigeons  that  demonstrated  more 
self-control.   In  any  case,  results  showing  that  both 
pigeons  and  younger  (less  verbal)  humans  tend  to  respond 
impulsively  in  self-control  situations,  and  that  verbal 
processes  play  a  role  in  human  performance,  strongly  suggest 
that  verbal  history  is  an  important  determinant  of  self- 
control  in  humans. 

Verbal  processes  cannot  explain  all  species  differences 
in  self-control,  however  (van  Haaren,  van  Hest,  &  van  De 
Poll, 1988). Van Haaren et al. investigated choices of male
and female rats between 1 and 3 food pellets. Presses on
the  right  lever  produced  the  larger  (3-pellet)  reinforcer, 
whereas  presses  on  the  left  lever  produced  the  smaller 
(1-pellet)  reinforcer.   When  each  reinforcer  was  preceded  by 
a  6-s  delay,  all  subjects  preferred  the  large  reinforcer. 
When  the  delay  associated  with  the  small  reinforcer  was 
decreased  to  0.1  s,  all  subjects  continued  to  prefer  the 
large  reinforcer.   When  contingencies  associated  with  the 
levers  were  reversed,  most  of  the  subjects  switched  levers 
and  continued  to  prefer  the  larger,  more  delayed  reinforcer. 
In  a  second  experiment  with  different  rats  as  subjects,  the 
small  reinforcer  was  always  delivered  after  a  6-s  delay  and 
the  large  reinforcer  was  delayed  either  9,  15,  24,  or  36  s 
during  different  conditions.   Most  subjects  consistently 
preferred  the  large  reinforcer.   Rats'  choices  in  this  study 
differed  from  those  of  pigeons  under  similar  arrangements, 
more  closely  resembling  human  performance.   Among  the 
interpretations  of  the  differences  between  pigeons  and  rats 
considered by van Haaren et al. was the notion that
elicited  key  pecks  might  contribute  to  the  impulsive 
responding  typical  of  pigeons. 

It  is  well  known  that  a  stimulus  paired  with  food 
presentation  will  elicit  stimulus-directed  pecking  in 
pigeons (Schwartz & Gamzu, 1977). Poling, Thomas, Hall-Johnson,
and Picker (1985) demonstrated that a red key
paired with a small reinforcer (3-s access to grain) was
more often the target of elicited key pecks than a
simultaneously presented blue key paired with a larger
delayed reinforcer (9-s access to grain). Lopatto and Lewis
(1985) investigated the role of elicited pecks in a single-key
self-control arrangement, in which pecking a key during
periodic 4-s presentations produced a small reinforcer (2-s
access to grain), while not pecking resulted in a larger
reinforcer  (4-s  access  to  grain)  delivered  after  the  key  was 
darkened.   Subjects  responded  impulsively,  pecking  the  key 
on  95%  of  trials.   When  pecking  no  longer  produced  the  small 
reinforcer  and  canceled  the  large  reinforcer,  pigeons 
continued  to  peck  on  75%  of  key  illuminations,  suggesting 
that  elicited  pecks  also  contributed  to  the  impulsiveness 
observed  in  the  first  procedure.   The  role  of  elicited  key 
pecks  in  standard  two-key  self-control  arrangements  with 
pigeons  has  not  been  determined,  although  the  studies  cited 
here  suggest  that  elicitation  may  add  to  the  impulsiveness 
observed  in  some  of  these  experiments. 

Finally,  procedural  differences  involving  the  nature  of 
the consequences may contribute to the reported performance
differences  between  humans  and  pigeons  in  studies  of  choice 
and  self-control.   In  most  human  experiments,  consequences 
consist  of  points  (token  reinforcers)  that  are  exchangeable 
for  money  some  time  after  the  experimental  session.   Humans 
may  be  more  likely  to  demonstrate  self-control  because  there 
is  no  advantage  to  obtaining  points  quickly,  since  they 
cannot be exchanged until the session is over. Thus, the
point  arrangement  characteristic  of  human  studies  may  favor 
maximization  over  the  length  of  the  session.   In  pigeon 
studies,  on  the  other  hand,  the  typical  consequence  is  food, 
an  unconditioned  reinforcer  of  more  immediate  consummatory 
value.   This  arrangement  may  favor  impulsivity.   Consistent 
with  this  interpretation  are  reports  of  impulsiveness  in 
humans  when  food  (Ragotzy,  Blakely,  &  Poling,  1988)  or 
escape  from  unconditioned  aversive  stimuli  (Navarick,  1982; 
Solnick  et  al.,  1980)  are  consequences  of  choice  responding. 

Ragotzy  et  al.  (1988)  demonstrated  impulsiveness  in 
humans  when  food  was  the  consequence  of  choice  responding. 
Severely  retarded  human  adolescents  chose  between  1  and  3 
Cocoa  Puffs.   Choices  were  made  by  touching  one  of  two 
different  colored  cards,  each  associated  with  one  of  the 
reinforcer  options.   All  3  subjects  preferred  the  large 
reinforcer  when  both  reinforcers  were  delivered  immediately, 
but  as  the  delay  to  the  large  reinforcer  was  increased 
across  conditions,  preference  shifted  strongly  in  favor  of 
the  small  reinforcer.   The  human  subjects  in  this  study 
responded  somewhat  differently  than  pigeons,  preferring  the 
large  delayed  reinforcer  under  some  parameters  in  which  the 
matching  law  predicts  strong  impulsiveness.   However,  unlike 
the  human  subjects  in  prototypical  choice  studies,  and  more 
like  pigeons,  these  subjects  failed  to  maximize 
reinforcement,  responded  impulsively,  and  were  sensitive  to 
the diminishing effects of reinforcer delay on reinforcer
value.   In  a  second  phase  of  the  experiment,  the  delay  to 
the  small  reinforcer  was  increased  across  conditions  and 
preference  shifted  back  to  the  large  reinforcer,  a  result 
that  is  also  consistent  with  previous  findings  in  pigeons 
(Green  et  al.,  1981;  Rachlin  &  Green,  1972).   While  the 
Ragotzy et al. (1988) study lends some support to the notion
that  impulsiveness  is  more  likely  with  immediately 
consumable  reinforcers,  the  atypical  impulsive  responding  in 
their  human  subjects  could  also  be  related  to  the  verbal 
deficiencies  characteristic  of  the  severely  retarded. 

Solnick  et  al.  (1980)  investigated  choices  of  female 
college  students  who  solved  math  problems  while  wearing 
headphones.   In  one  condition,  after  15  s  of  exposure  to 
white noise (90 dBA) played through the headphones, subjects
were  given  a  choice  of  pressing  one  button  that  turned  the 
noise  off  immediately  for  a  short  duration  (90  s)  or 
pressing  an  alternate  button  that  turned  the  noise  off  for  a 
longer  duration  (150  s)  after  a  delay  of  30  s.   Unlike  the 
verbal  human  adults  in  most  studies,  these  subjects 
responded  impulsively,  strongly  preferring  the  immediate 
reinforcer (noise termination). A 15-s delay was added to
both  options  for  a  second  group  of  subjects,  by  scheduling 
the  choice  opportunity  at  the  start  of  each  trial.   Subjects 
exposed  to  this  condition  showed  exclusive  preference  for 
the  larger,  more  delayed  reinforcer,  a  finding  that 
resembles previous reports with pigeons (e.g., Green et al.,
1981).

Negative  reinforcement  by  noise  termination  also 
produced  impulsive  responding  in  adult  human  college 
students  in  a  study  by  Navarick  (1982).   Navarick's  subjects 
increasingly  preferred  the  small  reinforcer  as  the  delay  to 
the  large  reinforcer  was  increased,  preferred  immediate 
reinforcement  over  an  equal  duration  of  delayed 
reinforcement,  and  preferred  a  large  reinforcer  over  a  small 
reinforcer  when  both  were  delivered  immediately. 

Navarick  and  associates  have  also  examined  the  effects 
of  other  reinforcers  with  humans.   Impulsivity  was 
demonstrated  in  at  least  some  of  the  human  subjects  when 
either  access  to  a  video  game  (Millar  &  Navarick,  1984)  or 
slides  of  entertainment  and  sports  personalities  served  as 
choice consequences (Navarick, 1986). Another study
(Navarick,  1985)  examined  choice  when  illumination  of 
indicator  lights  that  the  subjects  were  told  to  react  to 
with  a  "pleasant  feeling"  served  as  consequences  of  choice 
responding.   In  this  case,  subjects  demonstrated  preference 
for  large  over  small  amounts  of  reinforcement  (duration  of 
illumination)  when  no  delays  were  scheduled  for  either 
choice  but  did  not  prefer  the  immediate  to  the  delayed 
reinforcer  when  reinforcer  amounts  were  equal.   This  finding 
raises  the  possibility  that  the  instructions  regarding  the 
point  reinforcers  in  human  self-control  studies  may  also 
play a role in the obtained insensitivity to large-reinforcer
delays, although insensitivity of the type
demonstrated by Navarick was not apparent in the Logue et
al. (1986) study. Together, Navarick's work shows that adult
human choices are generally more sensitive to differences in
reinforcer amount than in reinforcer delay and, because the
magnitude and reliability of delay sensitivity varied
considerably between the reinforcer types investigated, that
qualitatively different reinforcers likely have different
propensities for producing impulsiveness (Navarick, 1986).

With regard to the present discussion, the finding of
impulsiveness in many of these studies when reinforcers of
more immediate value serve as consequences of choice, and
the failure to show impulsiveness in human studies when
points serve as reinforcers, further suggest that the
characteristic consequences of choice in pigeon and human
studies (food vs. points) may contribute to the
characteristic differences in choice and self-control.

In  summary,  the  finding  that  pigeons  respond 
impulsively  under  self-control  arrangements  and  that  adult 
humans  typically  demonstrate  self-control  is  often  explained 
in  terms  of  the  verbal  processes  characteristic  of  humans 
(e.g.,  Mawhinney,  1982)  and  the  limited  capacity  for 
temporal  integration  in  pigeons.   The  finding  that  adult 
humans  respond  impulsively  when  negative  reinforcement  or 
access  to  positive  reinforcers  with  more  immediate  value 
serve as choice consequences suggests that the type of
reinforcement may be involved in the previously reported
species differences. The present experiments investigated
this possibility with pigeons as subjects, using tokens as
consequences of choice responding in a self-control
arrangement that more closely resembles the typical human
paradigm. Figure 1 illustrates the rationale for this
investigation.

Figure 1: A summary of research findings in self-control
experiments  with  pigeons  and  humans.   The  two  left  quadrants 
show  that  impulsiveness  has  usually  been  found  with  both 
pigeons  and  humans  when  reinforcement  has  immediate  value. 
The  upper  right  quadrant  represents  the  usual  finding  of 
self-control in humans when token reinforcement is used.
The  present  experiment  was  conducted  to  provide  information 
for  the  lower  right  quadrant  and  assessed  the  responding  of 
pigeons  with  token  reinforcement. 


EXPERIMENT  1 

The  points  delivered  as  consequences  in  human  operant 
studies  may  be  viewed  as  token  reinforcers  (Gollub,  1977; 
Kelleher,  1958;  Malagodi,  1967).   Token  reinforcers  are 
usually  physical  objects,  delivered  according  to  some 
schedule  of  reinforcement,  that  can  be  exchanged  for  some 
other  (terminal)  reinforcer.   Tokens,  however,  can  be 
defined  more  generally  as  conditioned  reinforcers  "that  the 
organism  may  accumulate  and  later  exchange  for  other 
reinforcers"  (Catania,  1992,  p.  400).   In  token  reinforcer 
arrangements  a  discriminative  stimulus  is  usually  associated 
with  exchange  periods,  during  which  a  specified  "exchange" 
response involving the token(s) is followed by presentation
of the terminal reinforcer. Thus, the token reinforcer
paradigm involves a schedule of token reinforcement, a
schedule of exchange periods (exchange schedule), and a
schedule of reinforcement of exchange responses by the
terminal reinforcer (Malagodi, Webbe, & Waddell, 1975;
Waddell, Leander, Webbe, & Malagodi, 1972; Webbe & Malagodi,
1978). All three schedules of the token paradigm are also
components  of  the  point  reinforcer  system  used  in  human 
operant  studies.   The  typical  procedural  arrangement  with 
humans differs from the token reinforcer paradigm, however,
in the following 3 ways: (1) point delivery consists of
incrementing a counter instead of delivering a physical
object; (2) the exchange response involves manipulation of
verbal stimuli that correspond to points instead of
manipulating a token object itself; and (3) the terminal
reinforcer consists of money (a generalized conditioned
reinforcer), instead of an unconditioned reinforcer.

The  token  reinforcer  arrangement  characteristic  of 
human  studies  may  produce  self-control  in  a  manner  similar 
to  the  commitment  response  procedure  described  earlier 
(Rachlin  &  Green,  1972).   Recall  that  self-control  was 
increased  in  this  study  when  pigeons  were  provided  with  an 
opportunity  for  advance  commitment  to  the  large  reinforcer. 
Similarly,  by  choosing  a  larger  number  of  points  during  the 
session,  humans  are  committing  to  a  greater  amount  of  money 
after  the  session.   In  both  cases,  a  choice  at  time  X 
determines  the  availability  of  reinforcement  at  a  later  time 
(X + T). With humans, in-session choices determine the
magnitude  of  post-session  (post-T)  monetary  reinforcement. 
For  pigeons,  commitment  responses  within  a  session  determine 
food  availability  after  T  seconds.   Interestingly,  if  the 
matching  law  were  applied  to  humans'  choices  using  the 
delays  and  magnitudes  of  monetary  reinforcement,  preference 
for  the  larger  amount  of  reinforcement  would  be  predicted. 
The  pervasiveness  of  self-control  in  human  subjects  may 
simply be a replication of the effects of scheduling choices
far enough in advance of the availability of reinforcement
(e.g., Green et al., 1981).
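
The effect of moving a choice further in advance of reinforcement can be sketched numerically. The example assumes the ideal matching form, in which the ratio of choices equals the amount ratio multiplied by the inverse delay ratio, and adds the same interval T to both delays. The equation, the function name, and all values below are assumptions chosen for illustration, not parameters from the studies cited.

```python
def large_to_small_ratio(amount_small, delay_small,
                         amount_large, delay_large, added_delay=0.0):
    """Predicted ratio of large to small choices under assumed ideal matching.

    added_delay is the interval T inserted between the choice and both
    reinforcers, so each delay becomes (delay + T).
    """
    d_small = delay_small + added_delay
    d_large = delay_large + added_delay
    if d_small <= 0 or d_large <= 0:
        raise ValueError("total delays must be positive")
    return (amount_large / amount_small) * (d_small / d_large)


# Hypothetical values: 2 units after 0.5 s versus 6 units after 6 s.
for T in (0, 10, 60):
    ratio = large_to_small_ratio(2, 0.5, 6, 6, added_delay=T)
    print(T, round(ratio, 3))
```

With these assumed values the ratio is below 1 when T is 0, which favors the small reinforcer, and above 1 when T is 10 s, which favors the large reinforcer. As T grows, the delay ratio approaches 1 and the prediction approaches the amount ratio alone, which is the sense in which a long interval before post-session payment would predict preference for the larger amount.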

The  points  delivered  in  studies  with  humans  might  also 
contribute  to  the  obtained  self-control.   The  correspondence 
of points to the amount of monetary reinforcement resembles
the correspondence of overhead lighting to the amount of
food reinforcement that was shown to promote self-control in
pigeons (Logue & Mazur, 1981).

This  interpretation  de-emphasizes  the  importance  of 
points  and  implies  that  they  are  subordinate  to  the 
scheduling  of  monetary  reinforcement  in  determining  humans' 
choices. Whether or not this is true of nonhumans' choices
is  not  known.   The  token  reinforcement  schedule,  however,  is 
often  considered  to  be  subordinate  to  the  exchange  schedule: 
the  token  derives  its  reinforcing  function  from  the  terminal 
reinforcer  that  is  available  only  during  exchange  periods. 
Also, while patterns of token-reinforced behavior usually
resemble those characteristic of the token reinforcement
schedule, the obtained rate of behavior and within-session
changes in patterns and rates across intertoken intervals
are determined by the exchange schedule (e.g., Malagodi et
al.,  1975;  Waddell  et  al.,  1972;  Webbe  &  Malagodi,  1978). 
An  extreme  example  of  this  is  the  extended  pauses  observed 
under  token  reinforcement  schedules  during  times  and 
stimulus  conditions  most  remote  from  the  exchange  period 
(e.g., Kelleher, 1958; Malagodi et al., 1975; Waddell et
al.,  1972;  Webbe  &  Malagodi,  1978). 

The present experiment investigated pigeons' preference
under  a  token  reinforcer  arrangement  similar  to  the  typical 
human  procedure  involving  point  delivery.   Choices  (pecks  on 
lighted  side  keys)  during  discrete  trials  resulted  in  the 
illumination (delivery) of either 1 or 3 LEDs (tokens).
Each  LED  could  be  "exchanged"  for  2-s  access  to  grain  by 
pecking  a  center  key  during  exchange  periods.   Exchange 
periods  were  initially  scheduled  after  each  trial;  the  ratio 
of  trials  to  exchange  periods  was  then  increased  across 
phases  until  a  single  exchange  period  was  scheduled  at  the 
end  of  the  session.   Increasing  this  ratio  in  successive 
phases  was  done  to  encourage  the  development  of  conditioned 
reinforcing  properties  of  the  LEDs  by  initially  providing  a 
strong  correlation  between  LED  presentation  and  food 
availability,  before  gradually  increasing  the  periodicity  of 
exchange  periods.   Gradually  increasing  the  ratio  of  trials 
to  exchange  periods  may  also  minimize  the  response-weakening 
properties  of  increasing  exchange  schedule  values  (Waddell 
et al., 1972). Also, exposing subjects to exchange
periods with increasing numbers of LEDs
across phases provided a rich history of correspondence
between LEDs and the number of food deliveries available.
Thus,  the  correspondence  of  LEDs  to  food  amounts  resembled 
the  correspondence  of  points  to  money  amounts  in  human 
studies. Finally, if changing the exchange schedule was
analogous to the manipulation of temporal variables (T, as
discussed above), then under conditions with choices between
1  immediate  LED  and  3  delayed  LEDs,  preference  for  the 
larger  delayed  reinforcer  (3  LEDs)  might  be  expected  to 
increase  as  the  ratio  of  trials  to  exchange  periods 
increased,  that  is,  as  choice  responses  became  increasingly 
remote  from  food  availability. 
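
The relation between trials, tokens, and exchange periods described above can be sketched as a small simulation. This is an illustration only and is not the MED-PC program used in the experiment. It assumes that every lit LED is exchanged during each exchange period, that a final exchange period follows the last trial, and that LED count is capped at the 34 LEDs on the panel. Delays to token delivery are not modeled, and the choice rules and all names are hypothetical.

```python
import random

MAX_LEDS = 34        # LEDs on the stimulus panel (from the Apparatus section)
FOOD_ACCESS_S = 2    # seconds of grain access per exchanged LED


def run_session(n_trials, trials_per_exchange, choose, rng):
    """Return one record per exchange period for a simulated session.

    choose(rng) is a hypothetical choice rule that returns 1 or 3 tokens.
    """
    if n_trials < 1 or trials_per_exchange < 1:
        raise ValueError("trial counts must be at least 1")
    leds_on = 0
    exchanges = []
    for trial in range(1, n_trials + 1):
        tokens = choose(rng)
        if tokens not in (1, 3):
            raise ValueError("choose must return 1 or 3")
        # Assumed safeguard: never light more LEDs than the panel holds.
        leds_on = min(MAX_LEDS, leds_on + tokens)
        if trial % trials_per_exchange == 0 or trial == n_trials:
            exchanges.append({
                "after_trial": trial,
                "leds_exchanged": leds_on,
                "food_access_s": leds_on * FOOD_ACCESS_S,
            })
            leds_on = 0  # assumed: all LEDs are exchanged before trials resume
    return exchanges


def always_large(rng):
    """Hypothetical rule: choose the 3-LED alternative on every trial."""
    return 3


def coin_flip(rng):
    """Hypothetical rule: choose either alternative with equal probability."""
    return rng.choice([1, 3])


rng = random.Random(0)
for ratio in (1, 2, 10):
    session = run_session(10, ratio, always_large, rng)
    print(ratio, [e["leds_exchanged"] for e in session])
```

Raising `trials_per_exchange` across calls mirrors the progression across phases: the same choices produce fewer exchange periods, each with more LEDs, so a given choice becomes more remote from food availability.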

Method 
Subjects 

Six  experimentally  naive  male  White  Carneau  pigeons 
(Columba  livia)  served  as  subjects.   All  subjects  were 
individually  housed  with  water  and  health  grit  continuously 
available.   Subjects  were  maintained  at  80%  of  their 
laboratory  free-feeding  weight. 

Apparatus

A  standard  3-key  pigeon  chamber  (Lehigh  Valley)  with  a 
modified  stimulus  panel  served  as  the  experimental  space.   A 
minimum  force  of  0.14  N  was  required  to  activate  either  side 
key  and  a  minimum  force  of  0.12  N  activated  the  center  key. 
Thirty-four  red  light-emitting  diodes  (LEDs)  were  recessed 
in  the  panel,  forming  a  horizontal  row  5  cm  below  the 
ceiling  and  0.7  cm  below  the  houselight  fixture  (see  Figure 
2).   The  LEDs  were  evenly  spaced  and  centered  1.7  cm  from 
each  end  of  the  panel.   Unless  otherwise  indicated,  onset  of 
LEDs always proceeded sequentially from left to right with
each  onset  accompanied  by  a  brief  tone.   Offset  of  LEDs 
always  proceeded  sequentially  from  right  to  left.   When 
operative,  the  left,  center,  and  right  keys  were  illuminated 
green,  red,  and  blue  respectively.   Primary  reinforcement 
consisted  of  access  to  mixed  grain  through  the  stimulus 
panel  reinforcement  aperture.   During  food  delivery,  all 
keylights and the houselight were dark and an orange light
above  the  feeder  was  illuminated.   White  noise  was  present 
in  the  experimental  room  to  mask  extraneous  sounds. 
Experimental  contingencies  were  scheduled  and  recorded  by  an 
IBM  286-compatible  computer  with  MED-PC  software. 

Procedure

Each  subject  was  first  exposed  to  a  one  hour  session  of 
adaptation  with  the  houselight  and  all  LEDs  illuminated  but 
no  other  programmed  contingencies  in  effect.   During 
magazine  training  and  exchange  keypeck  shaping,  the  number 
of  illuminated  LEDs  corresponded  to  the  number  of  food 
deliveries  available.   Magazine  training  sessions  began  with 
the  simultaneous  illumination  of  the  left-most  17  LEDs,  the 
white  houselight,  and  the  red  center  (exchange)  key. 
Intermittent hopper presentations were controlled by a
hand-held switch. When operated, the switch turned off 1 LED and
0.5  s  later  produced  food.   Alternate  switch  operations 
withdrew  the  hopper.   Magazine  training  ended  when  the 
subject ate readily from the feeder for at least five
consecutive  food  deliveries. 

Exchange-keypeck  shaping  began  with  the  same  stimulus 
conditions  as  magazine  training.   Successive  approximations 
to  keypecks  on  the  center  (exchange)  key  produced  offset  of 
1  LED,  followed  0.5  s  later  by  a  2-s  food  delivery.   Once  a 
keypeck  (exchange  response)  occurred,  each  remaining  food 
delivery of the session required a single peck on the
illuminated exchange key. All subjects were then exposed to
two sessions of 34 LED exchanges each, with the same
contingencies  on  the  exchange  key. 

Choice-key  training  began  with  the  illumination  of  the 
houselight and one choice key (left or right). Each subject
was  exposed  to  two  sessions  of  34  food  deliveries  each,  with 
a  different  choice  key  available  in  each  session.   A  single 
peck  on  the  illuminated  choice  key  turned  off  the  key  and 
turned  on  1  LED,  followed  0.1  s  later  by  an  exchange  period, 
signaled  by  illumination  of  the  exchange  key.   A  single  peck 
on  the  exchange  key  turned  off  the  key  and  1  LED,  followed 
0.5  s  later  by  2  s  of  food.   Throughout  the  experiment, 
exchange  periods  remained  in  effect  until  all  illuminated 
LEDs were exchanged. For one subject (1857), who did not
peck  the  choice  key  after  180  minutes  in  the  chamber, 
pecking  was  established  by  reinforcing  successive 
approximations  with  the  onset  of  an  LED  followed  by  the 
exchange  period. 

Throughout the remainder of the experiment, two
sessions  were  scheduled  daily,  five  days  per  week,  with  a 
5-min  blackout  between  sessions.   Each  session  consisted   of 
12  discrete  trials,  each  beginning  60  s  from  the  onset  of 
the  preceding  trial,  excluding  exchange  periods.   Failure  to 
respond  for  45  s  on  a  given  trial  delayed  the  onset  of  the 
next  trial  an  additional  60  s.   During  the  intertrial 
interval  (ITI)  the  houselight  and  all  keylights  were  dark. 

The  first  two  trials  of  each  session  were  forced 
exposure  trials,  designed  to  bring  behavior  into  contact 
with  the  consequences  programmed  on  both  keys.   The  key 
available  on  the  first  trial  (left  or  right)  was  determined 
randomly  with  a  probability  of  .5;  the  alternate  choice  key 
was  automatically  illuminated  during  the  second  trial.   The 
contingencies correlated with the illuminated key on
forced-exposure trials corresponded to those in effect on choice
trials.

Choice  trials  began  with  the  illumination  of  the 
houselight  and  both  side  (choice)  keys.   A  single  peck  on 
either  side  key  (choice  response)  darkened  both  keys  and 
produced  the  associated  consequences,  the  illumination  of 
either 1 or 3 LEDs. Large-reinforcer choices resulted in
the  illumination  of  3  LEDs — 1  immediate,  the  other  2  spaced 
0.6  s  apart.   Thus,  it  took  1.2  s  to  deliver  3  LEDs.   Small- 
reinforcer  choices  resulted  in  the  immediate  illumination  of 
1  LED. 

All subjects were initially exposed to a choice between
1 and 3 LEDs, scheduled "immediately", with an exchange
period following each trial (designated condition 1). The
large  reinforcer  was  arbitrarily  assigned  to  the  left  key 
for  three  subjects  and  to  the  right  key  for  the  other  three 
subjects (Table 1). This assignment was constant throughout
the  experiment.   When  scheduled,  exchange  periods  always 
began  0.1  s  after  the  last  LED  presentation.   Thus,  exchange 
periods  followed  small-reinforcer  (1  LED)  choices  by  0.1  s 
and large-reinforcer (3 LED) choices by 1.3 s.¹

Next,  subjects  were  randomly  divided  into  two  groups  of 
three  pigeons  each.   For  Group  A,  large-reinforcer  choices 
produced 3 LEDs after a 6-s delay (condition 1D). The ratio
of  choice  trials  to  exchange  opportunities  was  then 
increased  to  2:1,  5:1,  and  10:1,  across  conditions  2D,  5D, 
and  10D,  respectively.   For  Group  B,  the  ratio  of  trials  to 
exchange  periods  was  first  increased  from  1:1  to  2:1  to  5:1 
to  10:1,  before  adding  the  6-s  delay  to  the  large  reinforcer 
in the final condition (10D). Figure 3 shows the sequence
of  events  following  large-  and  small-reinforcer  choices. 


LEDs remained illuminated during the ITI after trials with
no scheduled exchange period. Whenever the ratio of choice
trials to exchange periods was greater than 1:1, only the
second forced trial was followed with an exchange period.
Table 1 summarizes the experimental conditions, order of
exposure, and number of sessions for all subjects.

¹ The LEDs are spoken of in terms of reinforcement,
recognizing that strict behavior-analytic criteria for doing
so have not been met. This is done on the basis of formal
similarities between the scheduling of LEDs here and the
scheduling of reinforcing consequences in other studies, and
for convenience when discussing and evaluating the role of
LEDs in the current experiment; this is consistent with
discussions of analogous consequences in human operant
studies.

Experimental  phases  were  in  effect  for  at  least  20 
sessions  and  until  the  following  stability  criteria  were 
met:  (a)  no  trends  evident  in  the  number  of  choices 
allocated  to  either  alternative  over  the  last  10  sessions 
and  (b)  the  number  of  choices  of  either  option  during  the 
last  5  sessions  not  outside  the  range  of  values  obtained 
during  all  previous  sessions.   Conditions  were  changed 
arbitrarily  if  these  criteria  were  not  met  in  80  sessions. 
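
The session-count limits and stability criterion (b) above can be sketched as a simple check. This is a minimal illustration, not the procedure actually used. The function names are hypothetical, and because the text does not say how trends were judged for criterion (a), that judgment is passed in as a plain boolean.

```python
from typing import Sequence

MIN_SESSIONS = 20  # minimum sessions per condition (from the text)
MAX_SESSIONS = 80  # conditions were changed arbitrarily after this many


def within_prior_range(choices: Sequence[int], last_n: int = 5) -> bool:
    """Criterion (b): the last `last_n` session values fall inside the
    range of values obtained in all earlier sessions."""
    if len(choices) <= last_n:
        return False
    earlier, recent = choices[:-last_n], choices[-last_n:]
    low, high = min(earlier), max(earlier)
    return all(low <= c <= high for c in recent)


def may_change_condition(choices: Sequence[int], no_trend: bool) -> bool:
    """Return True when a condition could end.

    `choices` holds the number of large-reinforcer choices per session.
    `no_trend` stands in for criterion (a), judged over the last 10
    sessions by the experimenter (an assumption of this sketch).
    """
    n = len(choices)
    if n >= MAX_SESSIONS:
        return True  # arbitrary change after 80 sessions
    return n >= MIN_SESSIONS and no_trend and within_prior_range(choices)
```

For example, a 20-session record whose last five values stay inside the earlier range would satisfy the check only if no trend had also been judged present.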

Results 
Figure  4  shows  the  number  of  large-reinforcer  choices 
across  all  experimental  conditions.   Data  from  Group  A  are 
displayed  in  the  left  panel  and  Group  B  in  the  right.   The 
bars  are  means  from  the  last  10  sessions  of  each  condition; 
vertical  lines  show  the  range  of  values  used  to  determine 
the means. Because each session included 10 choice trials, a
value  above  5  generally  indicates  preference  for  the  large 
reinforcer,  whereas  a  value  below  5  indicates  preference  for 
the  small  reinforcer.   A  mean  value  between  4  and  6,  with  a 
range that extends above and below 5, indicates
indifference. 
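
The reading rule for Figure 4 can be restated as a small classifier. This is a sketch only. The function name is hypothetical, the 4-to-6 band is treated as inclusive, and a mean of exactly 5 with no spread is labeled indifference; the text does not specify either of these details.

```python
from typing import Sequence


def classify_preference(session_counts: Sequence[int]) -> str:
    """Classify preference from large-reinforcer choices per session
    (out of 10 choice trials) over the last 10 sessions."""
    mean = sum(session_counts) / len(session_counts)
    low, high = min(session_counts), max(session_counts)
    # Indifference: mean between 4 and 6 with a range spanning 5.
    if 4 <= mean <= 6 and low < 5 < high:
        return "indifference"
    if mean > 5:
        return "large"
    if mean < 5:
        return "small"
    return "indifference"  # mean exactly 5 (assumption of this sketch)
```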

Condition  1,  with  no  delay  to  small  or  large 
reinforcers,  resulted  in  strong  preference  for  the  large 
reinforcer  in  5  of  6  subjects;  only  Subject  1857  (Group  A) 
preferred  the  small  reinforcer.   For  the  other  2  subjects  in 
Group A (747 and 1383), preference reversed in favor of the
small reinforcer when the large reinforcer was delayed 6 s
in condition 1D. Large-reinforcer choices also decreased
for Subject 1857 during this phase. All three subjects in
Group A preferred the immediate reinforcer across phases 1D,
2D, 5D, and 10D. This preference was generally strong, with
an average of fewer than 2 large-reinforcer choices per
session,  except  during  phase  2D  in  which  the  number  of 
large-reinforcer  choices  was  somewhat  elevated  for  Subjects 
1857  and  1383. 

For  subjects  in  Group  B,  scheduling  the  exchange  period 
every  second  choice  trial  reduced  the  number  of  large- 
reinforcer  choices  for  Subjects  1732  and  1855  but  not  for 
Subject  753.   Further  increases  in  the  number  of  trials  per 
exchange  period  during  conditions  5  and  10  shifted 
preference  in  favor  of  the  small  reinforcer  for  Subjects 
1855  and  753.   The  magnitude  of  this  effect  was  greatest  in 
Subject  753,  who  in  the  previous  two  conditions  chose  the 
large  reinforcer  on  nearly  all  trials.   For  Subject  1732, 
preference  for  the  large  reinforcer  was  recovered  during 
conditions 5 and 10 but reversed in favor of the small
reinforcer  when  a  delay  to  the  large  reinforcer  was  added  in 
condition  10D.   This  added  delay  also  resulted  in  fewer 
large-reinforcer  choices  for  Subject  753.   In  Subject  1855 
the  number  of  large-reinforcer  choices  increased  slightly 
during  this  condition,  resulting  in  approximate 
indifference. 

Figure  5  shows  within-session  choice  patterns.   The 
relative frequency of large-reinforcer choices is plotted
across  trials  preceding  scheduled  exchange  periods  over  the 
final  10  sessions  of  each  condition.   Only  data  from 
conditions  in  which  exchange  periods  occurred  after  two  or 
more  trials  are  shown.   As  before,  proportions  above  .5 
indicate  preference  for  the  large  reinforcer  and  proportions 
below  .5  indicate  preference  for  the  small  reinforcer. 

For subjects in Group A (left panels), the greatest
proportion  of  large-reinforcer  choices  occurred  during  the 
1st  trial  of  the  block  of  trials  preceding  exchange  periods. 
This  was  consistent  across  subjects  and  conditions,  except 
for  Subject  747  during  condition  10D  in  which  the  proportion 
of  large-reinforcer  choices  varied  unsystematically  across 
the  10  trials.   The  most  pronounced  differential  control  of 
large-reinforcer  choices  by  trial  position  occurred  during 
condition  2D  in  Subject  1383:  the  proportion  of  large 
reinforcers  chosen  was  .82  in  the  1st  trial  but  zero  during 
the  2nd  trial  of  the  block.   For  all  subjects  in  Group  A, 
during conditions 2D and 5D the proportion of
large-reinforcer choices was greatest during the 1st trial and
dropped to zero or near-zero levels during the remaining
trial(s) of a block. During condition 10D, except for
Subject 747, the probability of a large-reinforcer choice
decreased  across  trials,  reaching  a  level  of  zero  during  the 
latter  trials  of  the  block. 

Similar,  though  less  pronounced,  effects  occurred  with 
subjects in Group B (right panels). The relative number of
large-reinforcer  choices  was  greatest  during  the  initial 
trial  of  the  block  in  8  of  12  cases  for  the  three  subjects 
and  decreased  to  lower  levels  across  remaining  trials. 

Figure  6  shows  average  choice  latencies  during 
conditions  in  which  exchange  periods  were  scheduled  after 
two  or  more  trials,  from  the  last  10  sessions  of  each 
condition. Latencies for subjects in Groups A and B are
displayed in left and right panels, respectively. Note that
the Y axes are scaled individually to accommodate
between-subject differences in latencies. Open symbols represent
latencies for large-reinforcer choices and filled symbols
for small-reinforcer choices. The absence of a data point
for  either  choice  denotes  conditions  in  which  choices  of 
that  type  did  not  occur. 

In  38  of  40  cases  across  subjects,  latencies  were 
longest  during  the  1st  trial  of  a  block,  decreasing  across 
trials. The 1st-trial latencies also tended to be longer as
the number of trials per exchange period was increased
across conditions. This effect was clearest for Subject
1857, for whom 1st-trial latencies were shortest during
condition 2D, somewhat longer during condition 5D, and
longest during condition 10D. With one exception (the 2nd
trial of condition 10D for Subject 1732), the longest
latency  for  each  subject  occurred  on  the  1st  trial  in 
conditions  with  exchange  periods  scheduled  every  10th  trial. 
Subjects  747  (condition  10D)  and  1855  (condition  10) 
regularly had 1st-trial choice latencies longer than 45 s,
which  postponed  the  onset  of  the  2nd  trial.   No  trend  was 
evident  in  these  latencies  and  they  did  not  systematically 
relate  to  choice. 

Discussion 
In  Experiment  1,  pigeons'  choices  were  assessed  in  a 
self-control  arrangement  with  token-like  reinforcers. 
Despite  the  procedural  similarities  of  this  arrangement  with 
typical  human  procedures,  the  overall  results  of  Experiment 
1  support  previous  findings  with  pigeons  (Logue  et  al., 
1984;  Mazur  &  Logue,  1978)  rather  than  with  humans  (Logue  et 
al.,  1986).   That  is,  subjects  usually  responded 
impulsively,  preferring  the  small  immediate  reinforcer  over 
the  large  delayed  reinforcer  (Figure  4).   Such  impulsive 
responding  is  consistent  with  the  matching  law  applied  to 
LED  reinforcement. 

Figure 7 shows matching-law predictions of the number
of large-reinforcer choices based on LED reinforcement. To
obtain meaningful predictions, a delay of .01 s was used
instead of 0 s when reinforcement delivery was immediate.
Thus, both D_L and D_S were .01 s when neither reinforcer was
delayed (no delay); when the large reinforcer was delayed, a
value of 6 s was used for D_L. The reinforcer amounts used
to calculate predicted values were 3 (A_L) and 1 (A_S) for the
large and small reinforcers, respectively (3 or 1 LEDs).
The  relative  number  of  large-reinforcer  choices  predicted  by 
the  matching  law  was  first  calculated,  then  multiplied  by  10 
to  obtain  the  predicted  number  of  large-reinforcer  choices 
out  of  10  trials.   The  matching-law  predictions  correspond 
very  well  to  obtained  data  from  conditions  in  which  the 
large reinforcer was delayed (1D, 2D, 5D, and 10D in Figure
4). Here, in 14 of 15 cases the small reinforcer was
preferred;  the  only  exception  was  the  indifferent  responding 
of  Subject  1855  under  condition  10D.   The  matching-law 
predictions  correspond  less  well  to  obtained  data  from 
conditions  without  a  reinforcer  delay  (1,  2,  5,  and  10). 
Under  these  conditions  the  large  reinforcer  was  preferred  in 
only  8  of  15  cases.   Six  of  the  7  exceptions  were  from 
conditions  for  Group  B  subjects  in  which  the  number  of 
trials  per  exchange  period  exceeded  one.   LED  reinforcement 
did  not  differ  between  these  conditions,  suggesting  that 
other factors were responsible for the lower number of
large-reinforcer  choices. 
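
The LED-based predictions described above can be reproduced with a few lines of arithmetic. The sketch below assumes the ideal matching law takes the form B_L/B_S = (A_L/A_S)(D_S/D_L), which is not restated in this section, and the function names are hypothetical.

```python
IMMEDIATE = 0.01  # s, substituted for a 0-s delay (from the text)


def matching_proportion(a_large: float, a_small: float,
                        d_large: float, d_small: float) -> float:
    """Relative number of large-reinforcer choices under the assumed
    ideal matching law: B_L/B_S = (A_L/A_S) * (D_S/D_L)."""
    ratio = (a_large / a_small) * (d_small / d_large)
    return ratio / (1.0 + ratio)


def predicted_large_choices(a_large: float, a_small: float,
                            d_large: float, d_small: float,
                            trials: int = 10) -> float:
    """Predicted number of large-reinforcer choices out of `trials`."""
    return trials * matching_proportion(a_large, a_small, d_large, d_small)


no_delay = predicted_large_choices(3, 1, IMMEDIATE, IMMEDIATE)  # 7.5
delayed = predicted_large_choices(3, 1, 6.0, IMMEDIATE)         # about 0.05
```

Under these assumptions the law predicts 7.5 large-reinforcer choices per 10 trials with no delay and nearly none with the 6-s delay.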

LED  reinforcement  parameters  also  cannot  account  for 
the  within-session  patterns  of  choice  shown  in  Figure  5. 
For  example,  during  conditions  2  and  2D,  a  greater 
proportion  of  large-reinforcer  choices  occurred  on  the  1st 
trial  of  a  block  than  the  2nd.   In  four  subjects  (1857, 
1383, 1732, and 1855) the large reinforcer was preferred on
the  1st  trial  while  the  small  reinforcer  was  preferred  on 
the  2nd  trial.   Also,  during  conditions  in  which  exchange 
periods  were  scheduled  after  5  or  10  trials,  especially  for 
subjects  in  Group  A,  the  proportion  of  large  reinforcers 
chosen  tended  to  be  greatest  during  the  1st  trial,  often 
shifting  downward  abruptly  from  the  1st  to  the  2nd  trial. 

The  within-session  pattern  of  choices  under  conditions 
2  and  2D  is  more  consistent  with  the  predictions  of  the 
ideal matching law applied to food parameters than to LED
reinforcement.   Figure  8  shows  matching-law  predictions 
(based  on  food  reinforcement)  of  the  relative  number  of 
large-reinforcer  choices  for  each  trial  in  the  block 
preceding  exchange  periods  under  all  experimental 
conditions.² Predictions for Group A are shown in the top
graph  and  for  Group  B  in  the  bottom.   The  displayed  values 
are  based  on  food  amounts  and  delays.   The  amounts  used  to 
calculate  these  values  are  based  on  the  total  amount 
(seconds)  of  food  available  during  the  exchange  period 
following all trials of a block. The delay values are based
on  the  minimum  delays  to  the  first  food  delivery  of  an 
exchange  period,  excluding  choice  response  and  exchange 
response  latencies.   For  each  experimental  condition  the 
food-delay  values  on  the  trial  immediately  preceding  an 
exchange  period,  when  LED  reinforcement  is  immediate  for 
both  options,  are  1.8  s  and  0.6  s  for  large-  and  small- 
reinforcer  choices,  respectively.   When  the  large  reinforcer 
is  delayed,  the  food  delay  values  are  7.8  s  and  0.6  s  for 
large-  and  small-reinforcer  choices,  respectively.   Because 
on  all  trials  except  the  trial  just  prior  to  an  exchange 
period  the  delays  to  food  are  equal  for  both  large-  and 
small-reinforcer  choices,  the  matching-law  predictions 
across these trials are determined solely by
amount-of-food ratios. Table 2 shows the amount-of-food values used in
calculating the proportions displayed in Figure 8 and the
results of all calculations. The effect of a choice in a
given trial on the relative amount of food obtained in a
subsequent exchange period depends on reinforcer choices
during  all  other  trials  of  the  block.   For  this  reason,  the 
predicted proportion of large-reinforcer choices for each
trial number was determined by first calculating the ratio
assuming exclusively small-reinforcer choices for all other
trials of the block and then calculating the ratio assuming
exclusively large-reinforcer choices for all other trials.
These two ratios establish the range of predictions for a
given trial number under a particular experimental
condition. The shapes of the obtained functions in both
calculations were the same and the magnitude of difference
between the two values on any trial was always small. The
ratios were therefore averaged to obtain the values
displayed in Figure 8 (see the Appendix for complete
calculation examples). Because the experimental procedure
involved concurrent fixed-ratio 1 schedules, these
predictions were not expected to provide precise estimates
of choice response ratios but rather to predict the
direction of preference; the obtained choice ratios would
thus be expected to be more extreme than illustrated.

² Although there is no precedent for applying the
matching law to food parameters in an arrangement like the
one here, the matching law should have relevance to the
present data. The method of application described and
presented here was selected on rational but also pragmatic
grounds: it yielded results that were consistent with the
obtained choice in the current experiments.
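
The food-based calculation just described can be sketched as follows. Several details are assumptions of this sketch rather than the author's stated method: the matching law is taken as B_L/B_S = (A_L/A_S)(D_S/D_L), each exchanged LED is counted as 2 s of food, and the two bounding predictions are averaged as proportions. Table 2 and the Appendix are not reproduced in this section, so the exact values there may differ. All names are hypothetical.

```python
FOOD_PER_LED = 2.0   # s of food per exchanged LED (from the Procedure)
LARGE, SMALL = 3, 1  # LEDs earned by each choice


def proportion(a_large, a_small, d_large=1.0, d_small=1.0):
    """Relative large-reinforcer choices under the assumed matching law."""
    ratio = (a_large / a_small) * (d_small / d_large)
    return ratio / (1.0 + ratio)


def predicted(block_size, final_trial, large_delayed):
    """Average of the two bounding predictions for one trial of a block.

    block_size:    choice trials per exchange period (1, 2, 5, or 10)
    final_trial:   True for the trial just before the exchange period
    large_delayed: True when the 6-s LED delay is in effect
    """
    d_large = d_small = 1.0  # equal food delays cancel on non-final trials
    if final_trial:
        d_small = 0.6
        d_large = 7.8 if large_delayed else 1.8
    bounds = []
    for other_leds in (SMALL, LARGE):  # all other trials small, then large
        base = (block_size - 1) * other_leds
        a_large = (base + LARGE) * FOOD_PER_LED
        a_small = (base + SMALL) * FOOD_PER_LED
        bounds.append(proportion(a_large, a_small, d_large, d_small))
    return sum(bounds) / 2.0
```

Under these assumptions, `predicted(1, True, False)` gives indifference for condition 1, and a two-trial block yields a value above .5 on the 1st trial and below .5 on the 2nd, in line with the direction of preference described in the text.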

During conditions 1 and 1D an exchange period occurred
after each trial, so predictions are plotted only for trial
1, represented by the symbols "1" and "1D". Figure 8 shows
that  the  ideal  matching  law  applied  to  condition  1  predicts 
indifference  between  the  small  and  the  large  reinforcer. 
Figure  4,  however,  shows  that  5  of  6  subjects  strongly 
preferred  the  large  reinforcer  under  this  condition,  whereas 
Subject 1857 preferred the small reinforcer. The matching
law predicts strong preference for the small reinforcer
under condition 1D for Group A (top graph of Figure 8),
which is in accord with the obtained preferences (see Figure
4). When the exchange period is scheduled after two trials,
the  matching  law  predicts  preference  for  the  large 
reinforcer  on  the  1st  trial  of  the  block  and  the  small 
reinforcer  on  the  2nd.   The  data  shown  in  Figure  5  are  in 
qualitative, and sometimes quantitative, agreement with
these predictions. As predicted, Subjects 1857 and 1383
(left panel), and 1732 and 1855 (right panel), preferred the
large  reinforcer  on  the  1st  trial  of  a  block  and  the  small 
reinforcer  on  the  2nd  trial  of  a  block  of  trials  preceding 
an  exchange  period.   Subject  747  (left  panel)  preferred  the 
small  reinforcer  on  both  trials  but  a  greater  proportion  of 
large-reinforcer  choices  occurred  on  the  1st  trial  than  the 
2nd  trial,  yielding  a  curve  in  the  direction  predicted  by 
the  matching  law.   Data  from  Subject  753  (right  panel)  did 
not correspond to matching-law predictions; equally strong
preference  for  the  large  reinforcer  was  exhibited  during 
both  the  1st  and  2nd  trials  of  condition  2. 

Predictions  of  the  matching  law  were  less  accurate 
under  conditions  in  which  an  exchange  period  was  scheduled 
following  5  or  10  trials.   Preference  for  the  large 
reinforcer  is  predicted  across  all  but  the  final  trial  of  a 
block,  at  which  point  the  proportion  of  large-reinforcer 
choices  is  predicted  to  drop  steeply  below  .5.   Instead,  the 
proportion  of  large  reinforcers  chosen  tended  to  decrease 
across trials of a block, often shifting downward abruptly
from the 1st to the 2nd trial (see Figure 5).

Given the well-established sensitivity of pigeons'
choices to even small differences in delays to food, it is
not surprising that unequal delays to food also affected
responding  in  the  current  experiment.   In  fact,  small 
differences  in  delays  to  food  in  Experiment  1  may  have 
precluded  a  clear  assessment  of  choices  maintained  by  LED 
reinforcement.   For  example,  as  described  earlier,  on  choice 
trials  immediately  preceding  exchange  periods,  under 
conditions  in  which  the  large  reinforcer  was  delayed,  the 
minimum delays to food were 0.6 s following small-reinforcer
choices but were 7.8 s following large-reinforcer choices.
Similarly, minimum delays to food on trials immediately
preceding an exchange period were 1.8 s and 0.6 s for large-
and small-reinforcer choices, respectively, when LED
reinforcement  was  immediate  for  both  options.   These 
different  delays  to  food  were  a  joint  function  of  exchange 
periods  scheduled  immediately  after  LED  presentation,  the 
additional  time  taken  to  illuminate  three  LEDs  in  succession 
following large-reinforcer choices, and the added delay to
the  large  reinforcer  under  conditions  in  which  LED 
presentation  was  delayed.   The  ideal  matching  law, 
established  largely  on  the  basis  of  pigeons'  choices  under 
food  reinforcement  schedules,  applied  to  the  choices  in 
Experiment  1  with  food  parameters,  predicts  preference  for 
the small reinforcer on trials immediately preceding the
exchange  period,  whenever  exchange  periods  are  scheduled 
after  two  or  more  trials  (Figure  8).   Thus,  at  least  on  the 
final  trial  of  a  block,  food-reinforcement  parameters  would 
be  expected  to  have  had  a  greater  influence  on  choices  than 
the  subordinate  LED  arrangements.   This  interpretation  is 
consistent  with  the  choice  patterns  usually  observed  under 
conditions 2 and 2D (Figure 5).
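
The minimum food delays cited in this paragraph follow directly from the timing values given in the Procedure. The sketch below shows the arithmetic; the constant and function names are hypothetical, and choice and exchange response latencies are excluded, as in the text.

```python
LED_SPACING = 0.6       # s between successive LED onsets
EXCHANGE_ONSET = 0.1    # s from the last LED to the exchange period
EXCHANGE_TO_FOOD = 0.5  # s from the exchange peck to food


def min_food_delay(n_leds: int, led_delay: float = 0.0) -> float:
    """Minimum time from a choice peck to the first food delivery on
    the trial immediately preceding an exchange period."""
    led_time = led_delay + (n_leds - 1) * LED_SPACING
    return led_time + EXCHANGE_ONSET + EXCHANGE_TO_FOOD


small = min_food_delay(1)               # 0.6 s
large = min_food_delay(3)               # 1.8 s
large_delayed = min_food_delay(3, 6.0)  # 7.8 s
```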

Under  these  conditions,  differences  between  the  1st  and 
2nd  trials  in  delays  to  food  for  the  two  choice  responses 
and in stimulus conditions provided a basis for
discriminative  control  of  choice.   On  the  1st  trial,  with  no 
LEDs  illuminated,  most  subjects  preferred  the  large 
reinforcer.   On  the  2nd  trial,  when  at  least  one  illuminated 
LED  was  always  present  and  food  was  obtained  sooner 
following  a  small-reinforcer  choice,  most  subjects  preferred 
the  small  reinforcer.   Together,  these  results  are  in  accord 
with  the  predictions  of  the  ideal  matching  law  applied  to 
food  delays  (Figure  8)  and  extend  the  generality  of  previous 
findings  regarding  the  importance  of  food  reinforcer  delays 
in  controlling  choice  in  pigeons. 

Although  food-based  ideal  matching-law  predictions 
corresponded  less  well  to  results  from  conditions  in  which 
exchange  periods  occurred  after  5  or  10  trials  (Figures  3 
and 6), stimulus generalization, based largely on the
presence  of  illuminated  LEDs,  might  account  for  some  of 
these discrepancies between performances and matching-law
predictions.   Recall  that  the  presence  or  absence  of  LEDs 
distinguished  the  1st  and  2nd  trials  of  conditions  2  and  2D. 
During  subsequent  conditions,  when  exchange  periods  were 
scheduled  after  5  or  10  trials,  LEDs  were  illuminated  on  all 
trials  except  the  1st  trial  of  a  block.   The  greater  number 
of  large-reinforcer  choices  on  the  1st  trial  of  a  block 
occurred  in  the  absence  of  illuminated  LEDs,  a  situation 
correlated  with  no  differential  delays  to  food.   On  the 
final  trial  of  a  block,  with  LEDs  present,  food  delays 
favored  small-reinforcer  choices.   Such  control  of  small- 
reinforcer  choices  may  have  generalized  across  earlier 
trials,  with  LEDs  present,  resulting  in  fewer  large- 
reinforcer  choices  than  predicted  by  the  ideal  matching  law. 
The  latency  data  displayed  in  Figure  6  also  support  the 
view  that  the  presence  or  absence  of  LEDs  contributed  to  the 
choice  patterns.   For  both  large-  and  small-reinforcer 
choices,  latencies  were  generally  longest  during  the  1st 
trial  of  a  block,  the  trial  most  temporally  remote  from 
food,  and  the  trial  on  which  the  large  reinforcer  was  most 
preferred.   With  the  exception  of  Subject  1732,  latencies 
were  short  and  nearly  equal  across  the  remaining  trials  in 
which  illuminated  LEDs  were  always  present.   For  Subject 
1732,  latencies  tended  to  decrease  across  trials,  apparently 
under  control  of  the  increasing  proximity  to  food  delivery 
and  perhaps  the  increasing  number  of  LEDs.   That  this 
pattern did not occur in the other five subjects suggests
that  control  by  presence  or  absence  of  LEDs  was  greater  than 
control  by  increasing  numbers  of  LEDs  or  by  temporal 
proximity  to  food. 

Interestingly,  the  differential  preference  for  the 
large  reinforcer  on  the  1st  trial  in  the  current  experiment 
may  be  viewed  as  a  kind  of  self-control,  although  it  is  not 
clear  from  the  present  results  if  LEDs  or  food  deliveries 
should  be  treated  as  the  effective  reinforcers.   Of  course, 
both  LED  and  food  parameters  may  have  been  relevant.   The 
relative  influence  of  these  reinforcement  variables  was 
assessed in Experiment 2.

Figure 2. A diagram of the stimulus panel with the LEDs.

Figure 3. The sequence of events following large- and
small-reinforcer choices during conditions with (bottom
panel) and without (top panel) a large-reinforcer delay.


Figure  4.   The  number  of  large-reinforcer  choices  per 
session  across  experimental  conditions.   Data  from  Group  A 
subjects  are  shown  in  the  left  panel  and  Group  B  subjects  in 
the  right  panel.   Values  are  means  from  the  last  10  sessions 
of  each  condition.   Open  bars  indicate  no  delay  to  the  large 
reinforcer (3 LEDs). Striped bars indicate a 6-s delay to
the  large  reinforcer.   Vertical  lines  show  the  range  of 
values  used  to  determine  the  mean. 

Figure 5. The proportion of large-reinforcer choices at
each  trial  number  of  a  block  of  trials  preceding  exchange 
periods.   Values  are  derived  from  choice  trials  during  the 
last  10  sessions  of  each  experimental  condition  where  an 
exchange  period  occurred  after  two  or  more  trials.   Data 
from  Group  A  subjects  are  shown  in  the  left  panel  and  data 
from  Group  B  subjects  are  displayed  in  the  right  panel. 

Figure 6. Average choice latencies during conditions where
exchange periods were scheduled after two or more trials.
Values were derived from choice trials during the last 10
sessions of each experimental condition. Open symbols
represent latencies for large-reinforcer choices and filled
symbols indicate small-reinforcer choices. Data from Group
A  subjects  are  shown  in  the  left  panel  and  data  from  Group  B 
subjects  are  displayed  in  the  right  panel. 

Figure 7. The number of large-reinforcer choices predicted
by the matching law when the large reinforcer (3 LEDs) is
delivered with no delay (open bar) or with a 6-s delay
(black bar). Values are based on the matching law applied
to  the  amounts  and  delays  of  LED  reinforcement. 
Calculations  are  described  in  the  text. 

Figure 8. The proportion of large-reinforcer choices
predicted by the matching law applied to food reinforcement
for each trial number of a block of trials preceding
exchange periods. Group A data are displayed in the top
graph and Group B data in the bottom graph. The symbols 1
and 1D represent predictions based on conditions 1 and 1D.
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TABLE  1 

The  experimental  conditions,  order  of  exposure,  and 
number  of  sessions  for  all  subjects  in  Experiment  1. 
Group  A  histories  are  summarized  in  the  top  panel  and 
Group  B  in  the  bottom.   The  key  assigned  to  the  large 
reinforcer  is  indicated  below  each  bird  number. 
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TABLE  2 

The  amount  of  reinforcement  values  and  relative  number  of 
large-reinforcer  choices  predicted  by  the  ideal  matching  law 
for  each  trial  number  of  all  experimental  conditions  in 
Experiment  1.   Values  are  based  on  food  reinforcement.   When 
the  same  trial  number  is  listed  twice,  the  top  listing  shows 
values  when  the  small  reinforcer  is  chosen  on  all  other 
trials  and  the  bottom  listing  shows  values  when  the  large 
reinforcer  is  chosen  on  all  other  trials.   The  mean  values 
displayed  are  the  average  of  these  two  calculations  for  each 
trial  and  correspond  to  the  values  plotted  in  Figure  8 .   The 
food  delay  values  are  described  in  the  text  above. 
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EXPERIMENT  2 

The purpose of Experiment 1 was to clarify the role of
token reinforcement in accounting for previously reported
differences between the choices of pigeons and humans.
Unfortunately, the scheduling of exchange periods allowed
food delays to differ for the two choices, which may have
prevented a clear assessment of the relative influence of
LED versus food reinforcement. To distinguish these
separate sources of reinforcement, the major manipulations
of Experiment 1 were replicated in Experiment 2 with the same
subjects, with delays to food from either choice response
equal under most conditions but unequal in others.

Method 
Subjects  and  Apparatus 

The  pigeons  from  Experiment  1  served  as  experimental 
subjects.   Housing,  feeding  arrangements,  and  apparatus  were 
the same as in Experiment 1.
Procedure

Group and choice key reinforcer assignments were the
same as in Experiment 1. All subjects were initially
exposed to an arrangement similar to condition 1 of
Experiment 1 (also designated condition 1), except that the
exchange period occurred 1.5 s from either choice response.
For subjects in Group A, the large reinforcer was then
delayed by 6 s; the exchange period thus occurred 7.5 s
after a large-reinforcer choice (D1). Beginning with the
next condition (ED1), exchange periods were scheduled 9.5 s
from either choice, with no change in LED presentation.
Finally, the number of trials per exchange period was
increased across conditions (designated ED2, ED5, and ED10),
as in Experiment 1.

After  condition  1,  Group  B  subjects  were  first  exposed 
to  increases  in  the  ratio  of  trials  to  exchange  periods 
across  conditions  2,  5,  and  10.   Then  a  6-s  delay  was  added 
to the large-reinforcer choice (D10). Under this condition,
the exchange period occurred 7.5 s after a large-reinforcer
choice but still only 1.5 s after a small-reinforcer choice.
In the next condition (ED10), the exchange period was
scheduled 9.5 s from either choice. All subjects in Group B,
and Subject 1857 from Group A, were next exposed to a
reversal of contingencies on the choice keys (RED10),
followed by a return to the original contingencies (ED10).
Experiment 2 conditions are summarized in Table 3. Figure 9
shows the sequence of events following large- and
small-reinforcer choices during conditions with (bottom
panel) and without (top panel) a large-reinforcer delay.
The values from conditions D1 and D10 are shown in
parentheses above the exchange period in the bottom panel.

Results 

Figure  10  shows  the  number  of  large-reinforcer  choices 
across  all  experimental  conditions  for  subjects  in  both 
groups.   All  subjects  strongly  preferred  the  large 
reinforcer  in  condition  1  in  which  neither  reinforcer  was 
delayed  and  the  exchange  period  occurred  1.5  s  after  each 
choice. For 2 subjects in Group A (1857 and 747),
preference reversed in favor of the small reinforcer when
the large reinforcer was delayed by 6 s and the exchange
period occurred 7.5 s after a large-reinforcer choice (D1).
Subject 1383's performance was less sensitive to this
change, as only a small decrease in large-reinforcer choices
occurred. During condition ED1, in which the exchange
period was scheduled 9.5 s from either choice, preference
for the large reinforcer was recovered in Subject 1857 but
not in Subject 747. Subject 1383 continued to prefer the large
reinforcer  during  this  condition. 

Increasing the ratio of trials to exchange periods produced
different results among subjects in Group A. The number
of large-reinforcer choices decreased for Subject 1857,
resulting in indifference during conditions ED2 and ED5,
before increasing slightly during condition ED10.
Preference for the large reinforcer became stronger after
reversing the keys (RED10) and was maintained when the
original contingencies were reinstated in the final
condition. Subject 1857 was the only subject in Group A to
prefer the delayed large reinforcer in the terminal
arrangement, in which a single exchange period was scheduled
at the end of each session. Subject 747 was roughly
indifferent during conditions ED2 and ED5 and preferred the
small reinforcer during condition ED10. Subject 1383
preferred the small reinforcer across conditions ED2, ED5,
and ED10.

All  three  subjects  in  Group  B  exhibited  self-control  in 
the  terminal  arrangement  in  which  the  large  reinforcer  was 
delayed,  exchange  periods  were  scheduled  after  10  trials, 
and  there  was  an  equal  delay  to  the  exchange  period  from 
either  choice  (conditions  ED10  and  the  key  contingency 
reversal condition, RED10). Subjects 1732 and 1855 of Group
B preferred the large reinforcer across all conditions, even
when responding impulsively resulted in quicker access to
the exchange period (D10). For Subject 753, preference for
the large reinforcer decreased as the ratio of trials to
exchange periods was increased across conditions 2, 5, and
10, resulting in indifference during the latter two
conditions. Preference shifted dramatically to the small
reinforcer when the large reinforcer was delayed (condition
D10) but then reversed sharply in favor of the large
reinforcer in condition ED10, in which the time to the
exchange period from either choice was increased to 9.5 s.
Preference for the large reinforcer was maintained over the
next two conditions in which the keys were reversed (RED10)
and then returned (ED10).

Figure  11  shows  within-session  choice  patterns  for 
subjects in both groups. The relative frequency of
large-reinforcer choices is plotted across trials preceding
scheduled exchange periods. For Subjects 1732 and 1855, who
preferred the larger reinforcer across all conditions, the
relative frequency of large-reinforcer choices was high and
invariant across trial number. For the other 4 subjects, no
general within-session choice patterns were observed.
Responding  usually  varied  unsystematically  across  trials  of 
a  block,  although  large-reinforcer  choices  tended  to 
decrease  across  trials  for  Subjects  1857  and  753. 

Figure  12  shows  average  choice  latencies  during 
conditions  in  which  exchange  periods  were  scheduled  after 
two  or  more  trials,  from  the  last  10  sessions  of  each 
condition. Only latencies for the preferred option of each
condition are shown. The one exception is Subject 1857 under
condition ED5, in which each option was chosen equally often;
for that case, only large-reinforcer choice latencies are
shown. Latencies from the omitted option did not differ
systematically from preferred-option latencies and showed
the same general trends.

In  24  of  32  cases  across  subjects,  latencies  were 
longest during the 1st trial of a block and tended to
decrease across trials. Sometimes latencies dropped
abruptly from the 1st to the 2nd trial, as in condition ED5
for Subject 1857, conditions ED5 and ED10 for Subject 747,
and condition 5 for Subjects 1732 and 1855. For Subject 747
this decrease was followed by an increase in latencies
across remaining trials, although latencies remained well
below their 1st-trial values. In 4 of 6 subjects the
longest average latency occurred during the 1st trial under
a condition in which the exchange period was scheduled after
10 trials, but 1st-trial latencies did not generally increase
as the number of trials per exchange period was increased.
With the exception of Subjects 1383 and 1732, choice
latencies regularly exceeded 45 s, especially during earlier
trials of a block, which postponed the start of subsequent
trials. No consistent trend occurred with these latencies
and they were not systematically related to choice patterns.

Discussion 
In  contrast  to  the  results  of  Experiment  1,  self- 
control  was  obtained  in  4  of  the  6  subjects  during  the 
terminal choice arrangement of Experiment 2 (Figure 10). In
both experiments, the matching law applied to LED
reinforcement predicts preference for the large reinforcer
when no delays are programmed for LED presentation but
preference for the small reinforcer whenever the large
reinforcer is delayed by 6 s (see Figure 7, Experiment 1).
These predictions do not differ between experiments, so they
cannot account for the obtained choice differences.
Moreover, in Experiment 2, LED-based matching-law
predictions corresponded to choice responding in only 11 of
20 cases for Group A subjects and 11 of 24 cases for Group B
subjects (Figure 10). The primary exceptions to these
matching-law predictions in Experiment 2 were the high
numbers of large-reinforcer choices during conditions in
which presentation of the 3 LEDs was delayed.

The  results  of  Experiment  2  and  the  choice  differences 
between  the  two  experiments  are  more  consistent  with 
matching-law  predictions  derived  from  food-schedule 
parameters,  which  implicate  programmed  delays  to  food  as 
determinants of choice. Figure 13 shows matching-law
predictions, based on food-schedule parameters, of the
relative number of large-reinforcer choices for each trial
of a block of trials preceding scheduled exchange periods
under all experimental conditions of Experiment 2. Food was
delivered 0.5 s after an exchange response, so the food
delays used in generating Figure 13 were 0.5 s greater than
the delays to the exchange period listed in Table 3.
Matching-law predictions were determined for each trial as
in Figure 8 and are also displayed in Table 4. In Figure
13, open bars represent values during conditions in which
predictions do not differ between trials. Under condition
D10, the coarsely striped bar indicates the predicted value
for each of the first 9 trials of the block and the finely
striped bar illustrates the matching-law prediction for the
10th trial. Values on a given trial depend partly on
choices during other trials of a block, and error bars
indicate the range of predictions under all possible choice
patterns for these other trials. The upper and lower limits
of these bars indicate the predicted value when the small
reinforcer or large reinforcer, respectively, is chosen on
all other trials of a block. Under most conditions, the
delays from either choice to food are equal, so preference
for the large reinforcer is predicted; predictions also do
not differ between trials but are determined instead only by
differences in the effects of the two response options on
the amount of food obtained during the subsequent exchange
period. The predicted relative frequency of
large-reinforcer choices decreases as the number of trials per
exchange period increases, because the absolute amount of food
available in the exchange period increases and the relative
effect of a single choice response on that amount
decreases. On all trials of condition
D1 and on the 10th trial of condition D10, minimum delays to
food differed between the two response options: 2 s for the
small-reinforcer choice and 8 s for the large-reinforcer
choice. This difference results in a prediction of
preference for the small reinforcer on every trial of
condition D1 and on the 10th trial of condition D10.

Excluding condition D10, choice responding was in
agreement with food-based matching-law predictions in 30 of
42 cases (Figure 10). Of the 12 exceptions to the matching
law, 11 involved fewer large-reinforcer choices than
predicted. In most of these cases, however, a greater
number of large-reinforcer choices occurred than is
characteristic of pigeons in the more typical self-control
arrangements (e.g., Mazur & Logue, 1978). For example,
neither Subject 1857 nor 747 preferred the large reinforcer
during conditions ED2 and ED5 (Group A, left panel),
although both subjects chose the large reinforcer during
nearly half of the trials. There also were more
large-reinforcer choices during these conditions than in similar
conditions (2D and 5D) of Experiment 1 in which food delays
favored small-reinforcer choices (Figure 4). Similar
results occurred with Subject 753 (Group B), who chose the
large reinforcer more often during conditions 5 and 10 of
Experiment 2 (Figure 10) than in analogous conditions in
Experiment 1 (Figure 4). In all subjects, the smaller number
of large-reinforcer choices sometimes observed during
conditions in which exchange periods followed 2 or more
trials, as compared with condition 1, might also result from
the diminishing relative influence of choices on the amount
of food during the exchange period, as reflected in the
decreasing number of large-reinforcer choices predicted by
the matching law as the number of trials per exchange period
increases (Figure 13).
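
The food-based predictions discussed above (Figure 13 and Table 4) can be
approximated with a short calculation. The Python sketch below is an
illustration only, not the program used in the study. It assumes the simple
form of the matching law in which the value of an option is amount divided
by delay, a large reinforcer of 3 units and a small reinforcer of 1 unit,
and a food amount at the exchange period equal to the sum earned across all
trials of a block. The function and parameter names are hypothetical.

```python
def predicted_large_proportion(trials_per_exchange, large_on_others,
                               delay_large=10.0, delay_small=10.0,
                               large_amount=3, small_amount=1):
    """Predicted proportion of large-reinforcer choices on one trial.

    Assumption: value = amount / delay, where amount is the total food
    available at the exchange period if this option is chosen now.
    """
    others = trials_per_exchange - 1
    # Amount already contributed by the other trials of the block.
    per_other = large_amount if large_on_others else small_amount
    carried = others * per_other
    value_large = (large_amount + carried) / delay_large
    value_small = (small_amount + carried) / delay_small
    return value_large / (value_large + value_small)


def prediction_range(trials_per_exchange, **delays):
    """Return (upper, lower, mean) predictions for one trial of a block.

    upper: small reinforcer chosen on all other trials
    lower: large reinforcer chosen on all other trials
    """
    upper = predicted_large_proportion(trials_per_exchange, False, **delays)
    lower = predicted_large_proportion(trials_per_exchange, True, **delays)
    return upper, lower, (upper + lower) / 2


if __name__ == "__main__":
    # Equal food delays, 10 trials per exchange period.
    print(prediction_range(10))
    # Unequal food delays (8 s vs. 2 s), one trial per exchange period.
    print(prediction_range(1, delay_large=8.0, delay_small=2.0))
```

Under these assumptions, the 10-trial, equal-delay case yields about .545
and .517 (mean .531), and the one-trial case with 8-s and 2-s food delays
yields about .429, in line with the corresponding entries of Table 4.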

The other (12th) exception to predictions of the
matching law involves the large-reinforcer preference
exhibited by Subject 1383 under condition D1 (Figure 10,
bottom left graph). The insensitivity to delay evidenced
here differed from the delay sensitivity apparent in the
choice patterns of the same subject during analogous
manipulations of Experiment 1 (Figure 4, condition 1D).
Perhaps, as demonstrated in other experiments (e.g.,
Navarick & Fantino, 1976), the lower ratio of delays to food
from each choice in Experiment 2 accounts for this
difference. In condition 1D of Experiment 1, the minimum
delays to food were 7.8 s from a large-reinforcer choice and
0.6 s from a small-reinforcer choice, yielding a delay ratio
of 13:1 (large-choice delay:small-choice delay). In
condition D1 of Experiment 2, the minimum delays to food
from these choices were 8 s and 2 s, respectively, yielding
a delay ratio of 4:1. The lower delay ratio in Experiment 2
leads to a quantitatively different matching-law prediction,
in a direction favoring large-reinforcer choices (compare
Figure 8, condition 1D to Figure 13, condition D1). Thus,
although the choice patterns of Subject 1383 during
condition D1 differ quantitatively from matching-law
predictions, the choice differences between condition D1 of
Experiment 2 and condition 1D of Experiment 1 are in
qualitative agreement with the matching law.
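
The delay-ratio comparison above can be checked with a brief calculation.
The following Python sketch is illustrative and rests on assumptions not
stated in this passage: reinforcer amounts of 3 (large) and 1 (small), a
single trial per exchange period, and the simple amount/delay form of the
matching law. The helper name is hypothetical.

```python
def delay_ratio_and_prediction(delay_large, delay_small,
                               large_amount=3, small_amount=1):
    """Return the large:small delay ratio and the predicted proportion
    of large-reinforcer choices, assuming value = amount / delay."""
    ratio = delay_large / delay_small
    value_large = large_amount / delay_large
    value_small = small_amount / delay_small
    proportion = value_large / (value_large + value_small)
    return ratio, proportion


if __name__ == "__main__":
    # Condition 1D of Experiment 1: food delays of 7.8 s and 0.6 s.
    print(delay_ratio_and_prediction(7.8, 0.6))
    # Condition D1 of Experiment 2: food delays of 8 s and 2 s.
    print(delay_ratio_and_prediction(8.0, 2.0))
```

With these assumed amounts, the predicted proportion of large-reinforcer
choices rises from roughly .19 at the 13:1 delay ratio to roughly .43 at
the 4:1 ratio. Both values remain below .5, but the change is in the
direction favoring large-reinforcer choices.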

The  predominance  of  food  delays  over  LED  delays  was 
clearly  demonstrated  in  the  choice  patterns  of  Subject  1857 
(Figure 10). Preference for the small reinforcer occurred
during condition D1, in which presentation of the 3 LEDs was
delayed 6 s and food could be obtained more quickly by choosing
the small reinforcer. When the delays to food were equated
for both options during condition ED1, however, preference
reversed in favor of the large reinforcer, although LED
presentation continued to be delayed 6 s following
large-reinforcer choices.

The  choice  patterns  of  Subject  753  during  conditions 
D10  and  ED10  (Figure  10,  lower  left  graph)  were  also 
strongly  influenced  by  food  delays.   During  condition  D10,  a 
6-s  delay  was  added  to  the  presentation  of  LEDs  following 
large-reinforcer  choices  and  delays  to  food  also  differed 
between the two response options, so food could be obtained
more quickly by choosing the small reinforcer on the 10th trial
of a block. The shorter delay to food results in a
matching-law prediction of preference for the small
reinforcer on the 10th trial (Figure 13). It was argued
earlier that similar delay differences in Experiment 1
produced impulsive responding on the final trial of a block,
a pattern that generalized across trials. Consistent with
this interpretation, Subject 753 showed exclusive preference
for the small reinforcer across the last 6 trials of a block
during condition D10 and the number of large-reinforcer
choices decreased across trials (Figure 11). The number of
large-reinforcer choices per session also decreased during
condition D10 (Figure 10). The importance of delays to
food, as opposed to delays in LED presentation, for this
subject was also dramatically illustrated by the increase in
large-reinforcer choices during condition ED10, in which
food delays were equated for the two options but the delay
associated with LED presentation was not changed.

Unlike  Subject  753,  Subjects  1732  and  1855  preferred 
the large reinforcer during condition D10 (Figure 10).
Surprisingly, choices of Subject 1732 were sensitive to
differential delays for the two options during a similar
condition in Experiment 1 (Figure 4, condition 10D).
Perhaps, as discussed above for Subject 1383, the smaller
food-delay ratios of Experiment 2 account for the failure of
impulsive responding to develop during condition D10 for
this subject. The choices of Subject 1855, on the other
hand, were insensitive to food and LED delays associated
with large-reinforcer choices during the analogous
manipulation of Experiment 1 (condition 10D, Figure 4) and
showed apparently similar insensitivity during condition D10
of Experiment 2. However, a review of choice patterns
across all sessions of condition D10 of Experiment 2 (not
shown) revealed that this subject never chose the small
reinforcer on the 10th trial of a block and therefore did
not contact the shorter delay to food associated with the
small-reinforcer choice.

Figure 9. The sequence of events following large- and
small-reinforcer choices during conditions with (bottom
panel) and without (top panel) a large-reinforcer delay.
The values from conditions D1 and D10 are shown in
parentheses above the exchange period in the bottom panel.

Figure 10. The number of large-reinforcer choices per
session across all experimental conditions for both groups.
Graphing conventions are the same as in Figure 4, except the
finely striped bars indicate the first condition with the
large reinforcer delayed, the coarsely striped bars indicate
equal 10-s delays to the exchange period from either choice,
and the reversed coarse stripes indicate the key contingency
reversal (condition RED10).

Figure 11. The proportion of large-reinforcer choices at
each trial number of a block of trials preceding scheduled
exchange periods. Graphing conventions are as in Figure 5
except for the additional symbols and new experimental
conditions depicted in the figure keys. The key for Group A
is located in the lower left graph and for Group B in the
upper right graph.

Figure 12. The average choice latency during conditions
in which exchange periods were scheduled after two or more
trials. Values are derived from preferred choice trials
during the last 10 sessions of each experimental condition.
The axes are scaled individually for each subject. Other
graphing conventions are the same as in Figure 6 except for
the different symbol correspondences indicated in the figure
keys.

Figure 13. The proportion of large-reinforcer choices
predicted by the matching law applied to food reinforcement
for each trial of a block of trials preceding exchange
periods. Open bars represent values during conditions in which
predictions do not differ between trials. Under condition
D10, the coarsely striped bar indicates the predicted value
for each of the first 9 trials of a block and the finely
striped bar illustrates the matching-law prediction for the
10th trial. Error bars during conditions with more than one
trial per exchange period indicate the range of predictions
under all possible choice patterns for other trials of a
block.

TABLE 3

The experimental conditions, order of exposure, and
number of sessions for all subjects in Experiment 2.
Group A conditions are summarized in the top panel and
Group B in the bottom panel.

               Time from a Choice
               Response to the Exchange
Experimental   Period (Seconds)           Number of Sessions
Condition(a)   Large      Small

                                          Bird     Bird     Bird
Group A                                   1857     747      1383
1              1.5        1.5             27       28       27
D1             7.5        1.5             50       22       44
ED1            9.5        9.5             60       24       28
ED2            9.5        9.5             32       34       64
ED5            9.5        9.5             47       30       70
ED10           9.5        9.5             77       21       46
RED10          9.5        9.5             42       -        -
ED10           9.5        9.5             26       -        -

                                          Bird     Bird     Bird
Group B                                   1732     1855     753
1              1.5        1.5             28       26       30
2              1.5        1.5             24       20       90(b)
5              1.5        1.5             39       34       20
10             1.5        1.5             22       78       30
D10            7.5        1.5             33       22       42
ED10           9.5        9.5             26       80       30
RED10          9.5        9.5             20       80       36
ED10           9.5        9.5             40       34       106(c)

(a) The numbers 1, 2, 5, and 10 refer to the number of
trials per exchange period. The letter D indicates a
6-s delay to the large reinforcer (3 LEDs). The letter
E indicates an equal delay of 9.5 s from either choice
response to a scheduled exchange period. The letter R
indicates that the contingencies were reversed for the
choice keys.

(b) The choice key assignments were inadvertently
switched for two consecutive sessions and performance
was noticeably disrupted afterwards. The phase was
continued until stability criteria were met.

(c) Preference cycled between the large and small
reinforcer during most of the phase without meeting
stability criteria. At the 80th session a trend toward
the small reinforcer was evident and the phase was
continued until no trends were evident for 20
consecutive sessions.

TABLE 4

The relative number of large-reinforcer choices predicted by
the matching law for each trial number of all experimental
conditions in Experiment 2. Values are based on food
reinforcement. When there are two listings for the same
trial(s), the top listing shows values when the small
reinforcer is chosen on all other trials and the bottom
listing shows values when the large reinforcer is chosen on
all other trials. The mean values displayed are the average
of these two calculations for each trial and correspond to
the values plotted in Figure 13. The food delay values are
described in the text; the amount of food values are the
same as in Experiment 1.

Experimental                      Large /
Condition          Trial(s)    (Large + Small)    Mean

1, ED1             1               .750           .750

2, ED2             1-2             .667           .634
                                   .600

5, ED5             1-5             .583           .560
                                   .536

10, ED10, RED10    1-10            .545           .531
                                   .517

D1                 1               .429           .429

D10                1-9             .545           .531
                                   .517

D10                10              .231           .221
                                   .211

GENERAL DISCUSSION

All  subjects  chose  the  larger  delayed  reinforcer  more 
often  in  Experiment  2  than  in  Experiment  1.   Self-control 
increased  in  Experiment  2,  consistent  with  predictions  of 
the  matching  law,  primarily  because  the  minimum  delays  to 
food from choices were equated for both options during most
conditions. Together, these experiments confirm many
previous findings regarding the sensitivity of pigeons'
choices to delays in food presentation (e.g., Green et al.,
1981;  Lea,  1979;  Logue  et  al.,  1984;  Rachlin  &  Green,  1972). 

When  food  delays  were  prevented  from  differentially 
influencing  choice,  in  the  terminal  condition  (ED10)  of 
Experiment  2  that  most  resembles  the  typical  human 
procedure,  4  of  6  subjects  (1857,  1732,  1855,  and  753) 
preferred the larger delayed reinforcer (Figure 10). The
levels of self-control observed in Experiment 2 are
comparable to those reported in a similar study with humans
(Logue et al., 1986) and those found in a previous
demonstration of self-control with pigeons involving
delay-fading histories (Mazur & Logue, 1978). Also, the
variability, within and between subjects, was well within
the range characteristic of similar studies (e.g., Logue &
Pena-Correal, 1984; Logue et al., 1984, Experiment 1; Logue
et al., 1986, Experiment 1; Mazur & Logue, 1978; Rachlin &
Green, 1972; van Haaren et al., 1988). The reliability of
these effects, in the 4 subjects showing the most
self-control, was further established by reversing the
contingencies on the choice keys. In all cases, subjects
continued to prefer the larger delayed reinforcer,
regardless of the key with which it was associated (Figure
10, conditions ED10 and RED10), ruling out key color and
position bias as alternative explanations. This
manipulation  was  important  because  3  of  the  4  subjects 
exposed  to  the  key  reversals  had  prolonged  recent  histories 
of  preferring  the  same  option;  moreover,  position  and/or 
color  biases  are  especially  common  with  concurrent  FR1, 
discrete-trials  procedures  like  those  used  here  (e.g.,  Logue 
&  Pena-Correal,  1984;  Logue  et  al.,  1984,  Experiment  1;  van 
Haaren  et  al.,  1988). 

The  self-control  demonstrated  in  the  present  study  may 
be  viewed  in  terms  consistent  with  Skinner's  (1953) 
treatment  of  self-control.   In  Skinner's  terms,  choice  of 
the  immediate  small  reinforcer  might  be  considered  the 
controlled  response  and  choice  of  the  delayed  larger 
reinforcer  the  controlling  response.   In  this  case,  choosing 
the  delayed  reinforcer,  as  a  form  of  self-control, 
exemplifies the technique Skinner calls "doing something
else" (Skinner, 1953, p. 239). That is, choice of the
immediate smaller reinforcer is prevented by the emission of
an incompatible response (choice of the delayed larger
reinforcer). The process by which pigeons in the present
study  came  to  exhibit  self-control  and  acquire  this 
controlling  response  is  worth  considering. 

One  possibility  is  that  choice  of  the  3  LEDs  was 
directly  reinforced  by  the  LEDs.   Despite  the  present 
finding  that  food  delays  affected  choice  more  than  did  LED 
delays,  there  are  several  reasons  to  suspect  that  the  LEDs 
did  function  as  reinforcers.   First,  because  the  training 
histories  and  LED  arrangements  in  the  present  study  closely 
resemble  the  token  reinforcer  paradigm  (Malagodi,  1967),  it 
is  likely  that  the  LEDs  functioned  as  token  reinforcers. 
Although  subjects  in  the  present  study  did  not  directly 
manipulate  the  LEDs,  as  do  subjects  in  more  typical  token 
reinforcement  studies,  it  is  not  clear  that  such  handling 
enhances  reinforcing  efficacy.   Also,  the  long  latencies 
characteristic  of  lst-trial  choices  and  latency  reductions 
once  tokens  were  present  (Experiment  1,  Figure  6  and 
Experiment  2,  Figure  12)  resemble  previous  findings  with 
token  reinforcement  (Kelleher,  1958;  Malagodi  et  al.,  1975; 
Waddell  et  al.,  1972;  Webbe  &  Malagodi,  1978).   Informal 
observations  revealed  that  all  subjects  did  occasionally 
orient  toward  the  LEDs  when  they  were  presented  and  often 
pecked  at  them  during  the  ITI  and  prior  to  exchange  periods. 
Pecking is often elicited by conditioned stimuli paired with
food (Schwartz & Gamzu, 1977), stimuli that would also be
expected to have reinforcing properties (Gollub, 1977). LED
illumination might also be expected to function as
conditioned reinforcement because the accumulation of LEDs
was correlated with reductions in the delay to food
(Fantino, 1977).

Although  the  reinforcing  function  of  the  LEDs  is  not 
certain,  the  precise  function  of  the  LEDs  in  the  present 
study  is  no  more  mysterious  than  the  function  of  points 
delivered  in  similar  experiments  with  human  subjects  (e.g., 
Logue  et  al.,  1986).   Although  these  experiments  do  not 
usually  include  clear  functional  assessments  of  point 
delivery,  points  are  often  presumed  to  function  as 
reinforcers  in  humans,  even  in  the  absence  of  explicit 
instructions.   This  is  presumably  because  human  subjects 
typically  have  extensive  histories  with  points  and  numbers 
outside  of  the  laboratory.   These  histories  likely  establish 
precise discriminations in humans of more from fewer points
over  a  wide  range  of  absolute  numbers  of  points.   If  points 
are  delivered  as  reinforcers,  such  histories  may  also 
enhance  sensitivity  to  the  cumulative  amount  of 
reinforcement — sensitivity  that  may  be  related  to  the 
maximization  and  self-control  often  seen  in  human  subjects 
(e.g.,  Flora  &  Pavlik,  1992;  King  &  Logue,  1987;  Mawhinney, 
1982). The present finding of self-control in subjects that
did not have such extensive verbal and social histories
reveals that training circumstances provided within the
token reinforcer arrangement may be sufficient to produce
self-control.

Previously  reported  differences  in  the  performance  of 
humans  and  pigeons  under  self-control  procedures  may 
therefore  be  the  result  of  procedural  differences,  rather 
than  the  verbal  processes  characteristic  of  humans  per  se. 
A  number  of  studies  have  documented  differences  in  self- 
control  when  different  consequences  are  arranged  (e.g., 
Logue  et  al.,  1984;  Logue  et  al.,  1986;  Navarick,  1982; 
Ragotzy  et  al.,  1988;  Solnick  et  al.,  1980).   In  conjunction 
with  the  current  study,  these  experiments  suggest  that  with 
both  humans  and  pigeons,  self-control  is  less  likely  with 
reinforcers  of  more  immediate  value,  such  as  food  and  escape 
from  noise,  but  more  likely  with  token  reinforcers.   The 
self-control  obtained  with  token  reinforcement  could  thus  be 
viewed  simply  as  a  case  of  insensitivity  to  delays  with 
certain  kinds  of  consequences.   When  this  delay 
insensitivity  is  related  to  other  characteristics  of  the 
token  arrangement,  however,  more  complex  interpretations 
emerge.

It  is  consistent  with  the  token  reinforcement 
literature  to  discuss  the  exchange  period  as  a  reinforcer  of 
component  token  schedules  (e.g.,  Webbe  &  Malagodi,  1978) 
and,  in  the  present  study,  of  trial  choices.   In  Experiment 
1  of  the  current  study,  it  was  argued  that  quicker  access  to 
the exchange period on the final trial of a block following
small-reinforcer choices produced the impulsive responding
observed.   Similarly,  the  self-control  exhibited  in 
Experiment  2  might  be  interpreted  in  terms  of  reinforcement 
of  large-reinforcer  choices  on  the  final  trial  of  a  block  by 
onset  of  an  equally  delayed  exchange  period  with  a 
differentially  greater  amount  of  food.   Component  schedule 
sensitivity  to  exchange  period  food  amounts  was  shown  in  a 
study  by  Malagodi  et  al.  (1975)  in  which  rates  of  lever 
pressing  by  rats  were  inversely  related  to  the  amount  of 
food  obtained  in  the  exchange  period,  when  food  amounts  were 
manipulated  by  increasing  the  number  of  tokens  required  for 
each  food  delivery  in  the  exchange  period.   However,  the 
self-control  exhibited  by  subjects  in  Experiment  2  cannot 
easily  be  interpreted  as  simply  selection  of  large- 
reinforcer  choices  on  the  final  trial  of  a  block  by 
differentially  larger  food  amounts  during  the  exchange 
period.   For  example,  Subjects  1857,  1855,  and  753  all 
demonstrated  overall  preference  for  the  larger  delayed 
reinforcer  under  conditions  in  which  impulsive  choices 
increased across trials of a block (Figure 11). Also,
Subjects  1857,  1732,  1855,  and  753  all  preferred  the  larger 
delayed  reinforcer  during  early  trials  of  a  block,  at  times 
remote  and  discriminable  from  the  availability  of  an 
exchange  period. 

The  self-control  found  with  token  reinforcement  in  the 
present study might result from the temporal relationship
between choices for tokens at one time and a terminal
reinforcer that cannot be obtained until a later time. In
this regard, the token reinforcer procedure may be analogous
to commitment response procedures (Rachlin & Green, 1972).
This  interpretation  implies  that  choices  are  controlled 
primarily  by  their  relationship  to  the  terminal  reinforcer 
obtained  at  a  later  time,  which  accounts  for  the 
insensitivity to token delays. The predominance of food
reinforcement over LED reinforcement, demonstrated
repeatedly in the present study, and the finding that 4
subjects in Experiment 2 preferred the option eventually
yielding the greatest amount of food regardless of the
delays in LED presentation are also consistent with this
interpretation.

Although  the  self-control  obtained  with  token 
reinforcer  arrangements  may  result  from  similarities  with 
commitment  response  procedures  involving  temporal  relations 
between  choices  and  reinforcement  outcomes,  there  is  an 
important  difference.   In  the  token  reinforcer  procedure, 
the  amount  of  the  terminal  reinforcer  available  during 
exchange periods is an aggregate result of multiple choices
made prior to the exchange period. But pigeons' choices are
normally  insensitive  to  events  integrated  over  entire 
sessions.   This  suggests  that  tokens  may  generate  self- 
control  by  somehow  bringing  choices  under  the  control  of 
their aggregate effect on the amount of the terminal
reinforcer available in the exchange period. Token delivery
may facilitate choice of the larger reinforcer in this
context  by  providing  a  stimulus  (number  of  tokens) 
differentially  correlated  with  deferred  choice  outcomes 
regarding  food  amounts.   The  display  of  tokens  earned  during 
experimental  sessions  corresponded  precisely  with  the 
cumulative amount of food available during the subsequent
exchange  period,  a  seemingly  ideal  arrangement  for 
engendering  this  type  of  control.   This  interpretation  is 
also  consistent  with  Logue  and  Mazur's  (1981)  finding  that 
overhead  lights  differentially  correlated  with  the  large- 
reinforcer  delay  period  facilitated  self-control  in  pigeons. 
Logue  and  Mazur  suggested  a  conditioned  reinforcing  function 
of the light, but a discriminative function is more likely.
With  respect  to  the  present  study,  preferring  the  delayed  3 
LEDs  may  not  represent  reinforcement  by  LEDs  at  all. 
Rather,  the  LEDs  may  provide  a  more  immediate  discriminative 
basis  for  maintaining  choices  that  result  in  more  food 
during  the  exchange  period.   The  role  of  tokens  and  the 
token  display  in  improving  sensitivity  to  the  outcomes  of 
choice  on  overall  food  reinforcer  amount  could  be 
investigated  by  comparing  choice  in  a  token  reinforcer 
arrangement  with  and  without  an  ongoing  display  of  acquired 
tokens or by examining choice in a similar arrangement
without the tokens.

Choice of the larger delayed reinforcer as a
controlling  response  may  have  been  established  in  the 
present  study  because  of  the  presence  of  stimuli 
differentially  correlated  with  the  cumulative  outcomes  of 
choices.   In  traditional  terms,  such  stimuli  make  the 
subject "aware" of the consequences of alternative actions
and  may  facilitate  self-control  in  a  manner  similar  to  self- 
generated  rules  reported  by  human  subjects — rules  that 
similarly  correspond  to  the  outcomes  of  alternative  choice 
options  in  relation  to  overall  obtained  reinforcement 
(Logue,  1988;  Logue  et  al.,  1986;  Mawhinney,  1988).   Such 
verbal  stimuli  are  also  used  to  engender  self-control 
outside the laboratory (Skinner, 1953). Just as tokens may
bring  choices  under  control  of  the  amount  of  a  deferred 
terminal  reinforcer  by  providing  more  immediate  stimuli 
(tokens)  that  correspond  to  that  reinforcer,  so  might  verbal 
stimuli,  such  as  checks  on  a  list,  daily  logs  of  energy  use, 
and  weekly  weight  records,  bring  human  behavior  under  the 
control  of  respective  long-term  outcomes.   In  both  cases, 
such stimuli may function as a type of reinforcement.
Indeed, they occur response-dependently. Their critical
function, however, even when they are chosen, is their
discriminative effect on behavior that is itself important
because of its relationship to some other deferred
reinforcer.


APPENDIX 

A  SAMPLE  OF  FOOD  BASED  MATCHING  LAW  CALCULATIONS 

FROM  EXPERIMENT  1 


Condition       Ac x D         Bc       Sum of B                Mean

1               6 x .6         3.6        3.6
                -------        ---        ---    =  .500        .500
                2 x 1.8        3.6        7.2

1D              6 x .6         3.6        3.6
                -------        ----       ----   =  .188
                2 x 7.8        15.6       19.2

5                 14           14         14
(Trial 1-4)       --           --         --     =  .583
                  10           10         24
                                                                .560
                  30           30         30
                  --           --         --     =  .536
                  26           26         56

5               14 x .6        8.4        8.4
(Trial 5)       --------       ----       ----   =  .318
                10 x 1.8       18         26.4
                                                                .298
                30 x .6        18         18
                --------       ----       ----   =  .278
                26 x 1.8       46.8       64.8

5D              Same as condition 5
(Trial 1-4)

5D              14 x .6        8.4        8.4
(Trial 5)       --------       ----       ----   =  .097
                10 x 7.8       78         86.4
                                                                .090
                30 x .6        18         18
                --------       -----      -----  =  .082
                26 x 7.8       202.8      220.8

Note. The calculations for the first 4 trials of
condition 5 are based solely on the ratios of reinforcement
amounts for the two options because the delays to food are
equal except on the final trial of a block. Two
calculations  are  shown  for  trials  1-4  and  two  for  trial  5. 
The  first  calculation  assumes  that  only  the  small  reinforcer 
is  chosen  on  the  other  trials  of  a  block.   The  second 
calculation  assumes  that  only  the  large  reinforcer  is  chosen 
on  other  trials.   The  mean  of  these  two  calculations  was 
used  in  plotting  the  predictions. 
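
The arithmetic described in the Note can be retraced with a
short Python sketch. It is an illustration under stated
assumptions, not the procedure actually used to generate the
predictions. The function and variable names are
hypothetical. The amounts 6 and 2 and the multipliers .6 and
1.8 are copied from the rows for conditions 1 and 5, and the
rule for building block totals (four other trials plus the
current trial) is inferred from the tabled values 14, 10, 30,
and 26. The D multipliers are entered exactly as printed,
without asserting which option's delay each one represents.

```python
# Amounts as they appear in the condition 1 rows (assumed per-trial amounts).
LARGE, SMALL = 6, 2
# Assumption: a block has 5 trials, so 4 "other" trials accompany each choice.
OTHER_TRIALS = 4


def block_totals(other_amount):
    """Block totals if the large or the small option is chosen on the
    current trial, given the amount assumed chosen on the other trials."""
    base = OTHER_TRIALS * other_amount
    return base + LARGE, base + SMALL


def proportion(b_c, b_alt):
    """Relative value of the first option: b_c / (b_c + b_alt)."""
    return b_c / (b_c + b_alt)


def predictions(d_large=1.0, d_small=1.0):
    """Return the two predicted proportions and their mean.

    d_large and d_small are the multipliers printed beside the larger and
    smaller totals in the table; the defaults of 1.0 reproduce the
    amount-only calculation used for Trials 1-4.
    """
    values = []
    # First calculation: only small chosen on other trials; second: only large.
    for other in (SMALL, LARGE):
        total_large, total_small = block_totals(other)
        values.append(proportion(total_large * d_large, total_small * d_small))
    return values, sum(values) / len(values)


trials_1_4, mean_1_4 = predictions()          # condition 5, Trials 1-4
trial_5, mean_5 = predictions(0.6, 1.8)       # condition 5, Trial 5
print([round(v, 3) for v in trials_1_4], round(mean_1_4, 3))
print([round(v, 3) for v in trial_5], round(mean_5, 3))
```

Rounded to three places, the sketch gives .583 and .536
(mean .560) for Trials 1-4 and .318 and .278 (mean .298) for
Trial 5 of condition 5, in agreement with the tabled values.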